PREFACE

This report contains the papers presented at the conference on Small-Area Statistics in Chicago, Ill.,
on  August  17,  1977,  during  two  sessions  of  the  annual  meeting  of  the  American  Statistical  Association 
(ASA).  Both  sessions  were  sponsored  by  the  ASA  Committee  on  Small-Area  Statistics,  the  ASA  Business 
and  Economic  Statistics  Section,  and  the  ASA  Social  Statistics  Section. 

The  first  session  was  organized  and  chaired  by  Evelyn  Mann  of  the  New  York  City  Department  of 
Planning. The speakers were Richard Engels, Roger Herriot, Earle Gerson, and Charles Troob.

The  second  session  was  organized  and  chaired  by  Edward  Spar  of  Bill  Communications.  The  speakers 
were  Edwin  Coleman,  Martin  Ziegler,  Robert  W.  Schiedel,  and  Wray  Smith.  Also  presented  was  a  paper 
by  Charles  Hicks  and  Peter  Sailer. 

This  report  was  organized  and  prepared  under  the  direction  of  Alice  Winterfeld,  Chief,  Geographic 
Statistical  Areas  Branch,  Geography  Division,  Bureau  of  the  Census. 
Introduction

Evelyn  S.  Mann 
New  York  City  Department  of  Planning 


Welcome  to  the  first  of  two  sessions  of  the  Conference  on 
Small-Area  Statistics.  These  two  interrelated  sessions  have  been 
planned  by  the  American  Statistical  Association  Committee  on 
Small-Area  Statistics  and  have  been  sponsored  jointly  by  the 
Committee, the Business and Economic Statistics Section, and
the Social Statistics Section.

Since  session  No.  158  this  afternoon  at  2:00  P.M.  has  been 
conceived  as  a  continuation  of  this  morning's  session,  there  will 
only be one discussant to cover both. Wray Smith of the
Department of Health, Education, and Welfare (HEW) has agreed to
take on this herculean task.

Why,  you  might  ask,  do  we  need  another  session  on  current 
estimates  and  surveys?  At  this  very  annual  ASA  conference 
there  are  sessions  which  cover  aspects  of  this  topic.  There  has 
been  a  session  on  coverage  issues  on  CPS;  on  data  from  the 
Annual  Housing  Surveys;  on  the  Survey  of  Income  and 
Education;  and  on  Survey  Design  Innovations  at  the  Bureau  of 
the  Census,  to  name  just  a  few. 

The Committee on Small-Area Statistics has observed over
the  past  few  years,  as  Federal  survey  programs  proliferate,  a 
growing  confusion  and  dismay  on  the  part  of  small-area  data 
users.  In  published  reports  and  on  tape  files  there  are  varied 
current  estimates  for  the  same  time  periods  resulting  from 
different  surveys  and  estimates.  In  addition  to  just  total 
population,  variations  have  been  called  to  our  attention  on  such 
characteristics  as  income,  age,  employment,  etc. 

The most obvious example is the set of three separate estimates
the Bureau of the Census publishes for July 1 of each year: a
provisional  estimate,  followed  by  an  estimate  devised  for  general 
revenue  sharing  (accompanied  by  a  per  capita  income  estimate), 
followed  by  a  final  estimate  issued  under  the  Federal/State 
cooperative  program. 


The  issues  with  regard  to  the  various  Federal  formula  funded 
programs  have  been  discussed  in  previous  sessions  sponsored  by 
this  Committee,  particularly  those  related  to  general  revenue 
sharing  and  to  community  development. 

Less  well  known  are  the  problems  that  these  multiple 
estimates  cause  for  those  involved  in  the  complex  planning  and 
evaluation  programs  sponsored  by  the  Federal  government 
throughout the country. The regulations require that areas
engaged in 701 housing and land use contracts, 208 water
quality studies, and air quality and environmental studies all use
the same current estimates and forecasts in their evaluation.
Problems arise on how to converge upon a single set of figures.

The  Committee  has  asked  that  each  of  the  speakers  in  the 
two  sessions  today  integrate  the  following  points  in  their 
presentations: 

1.  What  is  the  implication  of  the  data  they  produce  for 
users? 

2.  How  does  their  estimate  or  forecast  compare  to  similar 
ones  produced  elsewhere  within  the  same  agency  or  in 
other  agencies? 

3. How do the inputs they use, for either estimates
or forecasts, differ from those of other agencies?

4.  What  is  the  appropriate  use  of  their  estimates  or 
forecasts?  Are  their  products  being  used  inappropriately? 

5.  Is  the  methodology  being  used  adequate?  If  not,  what 
improvements  are  planned? 

6.  Is  the  level  of  detail  produced  adequate?  If  there  were 
staff,  time,  and  funds  would  the  program  be  expanded? 


ESTIMATES,  SURVEYS,  AND  FORECASTS 


Ties  Among  Federally  Produced 
Population Estimates and Projections

Richard  A.  Engels 
Bureau  of  the  Census 

The intent here is a general review of Federal agency activities
in population estimates and projections, with identification
of the linkages between the agencies and the data series
involved. Population levels are no longer an issue of idle curiosity
and civic pride which serve occasionally as a research measure
and clutter little-read community plans. The population variable
is named frequently in legislation and is often incorporated in
the design of administrative structures as a reasonably accessible
proxy for the demand for services in the implementation of fund
distribution programs. Over 100 currently active Federal laws
rely upon population, at least in part, for their operation.¹
The Housing and Community Development Act of the Department
of Housing and Urban Development (HUD), the Comprehensive
Employment and Training Act of the Employment
and Training Administration (ETA), and General Revenue
Sharing of the Department of the Treasury rely on population
as one of the factors to distribute $3.5 billion, $7.0 billion,
and $7.5 billion annually, respectively.

Many of the programs specify census counts as being required
for use, and others are sufficiently demanding in terms
of  detailed  population  characteristics  that  only  the  decennial 
census  results  are  comprehensive  enough  to  satisfy  the  formula 
requirements.  However,  there  are  24  programs  in  which  total 
population  estimates  are  legislated  for  use  in  conjunction  with 
other  factors  to  allocate  funds.  Eleven  of  these  programs  rely 
totally  upon  population  as  the  only  background  information 
source  in  the  distribution  process. 

Methods  Overview 

Although  a  detailed  understanding  of  population  estimating 
methods  in  use  is  not  critical  to  an  appreciation  for  the  Federal 
population  data  system,  a  brief  background  will  assist  in  tracing 
the  estimates  through  the  Census  Bureau  and  in  identifying 
timing  problems  with  the  data  needs  of  other  agencies.  In  the 
fall  of  each  year,  population  estimates  for  individual  States 
are  developed  for  July  of  that  same  year.  The  estimates  are 
provisional,  with  revised  figures  being  produced  at  the  same 
time  for  the  previous  year. 

The revised estimates at the State level are developed by
averaging the results of three methods. Since April 1970, these
methods use current data to estimate population change. The
methods include: (1) Component Method II, which employs
vital statistics to measure natural increase and elementary
school enrollment data to estimate net migration; (2) the
Ratio-Correlation method, where a multiple regression estimating
equation is applied to the changes in the distribution of four
different series of data to estimate changes in population; and
(3) the Administrative Records method, where net internal
migration is estimated by using individual income tax returns.
Immigration from abroad is developed separately from reports on
intended residence of immigrants, and vital statistics are used
to estimate natural increase.

¹ Library of Congress, Congressional Research Service, Federal
Formula Grant-in-Aid Programs that Use Population as a Factor in
Allocating Funds (Washington, D.C., U.S. Government Printing Office, 1975).

All three methods are used only to estimate the non-group
quarters population under age 65. The population 65 and over
is estimated by adding the 1970 census population aged
65 and over and the estimated change in the number of people
enrolled under "Medicare" (the hospital and/or medical
insurance program under title XVIII of the Social Security Act)
between April 1, 1970 and the estimate date. A separately
maintained series of figures on institutional population is
added to develop total population.
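
As a rough illustration of how these pieces combine, the following Python sketch averages three method results for the non-group quarters population under 65, builds the 65-and-over figure from the 1970 census count plus the change in Medicare enrollment, and adds the institutional series. The function names, the equal weighting of the three methods, and all figures are assumptions made for illustration; they are not taken from Census Bureau procedures or published data.

```python
from statistics import mean


def population_65_and_over(census_1970_65_plus, medicare_1970, medicare_estimate_date):
    """1970 census count aged 65+ plus the change in Medicare enrollment."""
    return census_1970_65_plus + (medicare_estimate_date - medicare_1970)


def total_population(under_65_by_method, pop_65_plus, institutional_pop):
    """Average the method results, then add the separately estimated groups."""
    # Assumption: a simple unweighted mean of the three method results.
    averaged_under_65 = mean(list(under_65_by_method.values()))
    return averaged_under_65 + pop_65_plus + institutional_pop


# Hypothetical figures for one State (not from any published series).
under_65_by_method = {
    "component_method_ii": 4_100_000,
    "ratio_correlation": 4_130_000,
    "administrative_records": 4_070_000,
}
pop_65_plus = population_65_and_over(450_000, 430_000, 480_000)
state_total = total_population(under_65_by_method, pop_65_plus, 60_000)
print(pop_65_plus, state_total)
```

With these made-up inputs the three under-65 results average to 4,100,000, so the sketch simply shows the bookkeeping order: average first, then add the two separately maintained series.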

In Component Method II, the procedure for estimating the
non-group quarters population under age 65 involves: (1)
subtracting an estimate of group quarters population on
April 1, 1970 from the 1970 census population that would
be under age 65 on the estimate date (for July 1, 1975 this
would be the population under age 59.75 on April 1, 1970);
(2) adding births for the period between the 1970 census of
population and the estimate date; (3) deducting an allowance
for deaths (civilian plus military) occurring in this period
from the population which would be under age 65 on the
date of estimation; (4) adding an estimate of non-group
quarters net migration during the period to the population
that would be under age 65 on the estimate date; and (5)
adding an estimate of net movement between the civilian
population and the Armed Forces (separations minus
inductions plus military deaths) during the period.
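
The five steps above amount to simple component bookkeeping, which can be sketched in Python as follows. This is only an illustrative reading of the steps as listed; the function and argument names are hypothetical, and the input figures are invented rather than drawn from any actual estimate.

```python
def census_age_cutoff(census_year, census_month, est_year, est_month, age_limit=65.0):
    """Age at the census date below which a person is still under
    `age_limit` on the estimate date."""
    elapsed_years = (est_year - census_year) + (est_month - census_month) / 12
    return age_limit - elapsed_years


def component_method_ii(census_cohort, group_quarters_1970, births, deaths,
                        net_migration, net_military_movement):
    """Non-group quarters population under 65 on the estimate date."""
    base = census_cohort - group_quarters_1970      # step 1
    base += births                                  # step 2
    base -= deaths                                  # step 3
    base += net_migration                           # step 4
    base += net_military_movement                   # step 5
    return base


# April 1, 1970 census and a July 1, 1975 estimate date give the 59.75 cutoff.
cutoff = census_age_cutoff(1970, 4, 1975, 7)

# Hypothetical inputs for one State.
estimate = component_method_ii(
    census_cohort=4_000_000,        # 1970 census population under the cutoff age
    group_quarters_1970=80_000,
    births=350_000,
    deaths=120_000,
    net_migration=45_000,
    net_military_movement=5_000,
)
print(cutoff, estimate)
```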

In  determining  an  estimate  of  non-group  quarters  migration, 
net  migration  for  children  between  exact  ages  6.50  and  14.49 
on  the  estimate  date  for  each  postcensal  period  ending  July  1, 
is  developed  first  on  the  basis  of  age  data  from  the  1970  census 
together  with  fall  school  enrollment  data  for  elementary  grades 
1  to  8  for  the  year  1969  and  each  school  year  thereafter.  The 
amount  of  net  migration  for  school  children  in  these  ages  is 
converted  to  a  migration  rate  for  the  entire  non-group  quarters 
population  under  65.  These  estimates  of  net  migration  and  net 
migration  rates  relate  to  various  postcensal  periods  and  to 
cohorts  with  the  indicated  ages  on  the  estimate  date. 

In the Ratio-Correlation method, the percent change in the
State distribution of four symptomatic variables (the number
of students enrolled in elementary school, the number of
Federal income tax returns, the number of registered passenger
cars, and the number of persons in the work force) from 1970
to the estimate year is used to estimate the percent change in
the State distribution of the non-group quarters population
under age 65 from 1970 to the estimate year. First, the percent
change in the State distribution of the population between
1970 and the estimate year is derived by the use of a stepwise
linear estimating equation based on the relationships between
four symptomatic variables and population for 1960 and 1970
in combination with current data for the symptomatic variables.
Second, this estimated percent change in the States' distribution
of population is in turn multiplied by the share of the United
States population under age 65 which the State had in 1970.
This yields a preliminary estimate of the State distribution of
the population under age 65 in the estimate year. Third, the
figures in the preliminary distribution are adjusted
proportionately to sum to 100 percent. Finally, these distributions are
applied to an independent national estimate of the non-group
quarters population under age 65 in the estimate year.
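
The four steps can be traced with a small Python sketch. It assumes the estimating equation is a linear combination of the change ratios of the symptomatic variables; the coefficients, the three placeholder "States," and every figure below are hypothetical and serve only to show the order of operations.

```python
def predict_share_ratio(symptomatic_ratios, coefficients, intercept=0.0):
    """Step 1: linear estimating equation applied to the change in each
    symptomatic variable's State share (estimate year share / 1970 share)."""
    return intercept + sum(c * r for c, r in zip(coefficients, symptomatic_ratios))


def ratio_correlation(shares_1970, share_ratios, national_total):
    # Step 2: preliminary share = predicted change ratio x 1970 share.
    preliminary = {s: shares_1970[s] * share_ratios[s] for s in shares_1970}
    # Step 3: adjust proportionately so the shares sum to 100 percent.
    total = sum(preliminary.values())
    adjusted = {s: v / total for s, v in preliminary.items()}
    # Step 4: apply the shares to the independent national estimate.
    return {s: share * national_total for s, share in adjusted.items()}


# Hypothetical coefficients for the four symptomatic variables
# (school enrollment, tax returns, passenger cars, work force).
coefficients = [0.25, 0.25, 0.25, 0.25]
shares_1970 = {"A": 0.5, "B": 0.3, "C": 0.2}
symptomatic = {
    "A": [1.0, 1.0, 1.0, 1.0],
    "B": [1.1, 1.1, 1.1, 1.1],
    "C": [0.9, 0.9, 0.9, 0.9],
}
share_ratios = {s: predict_share_ratio(r, coefficients) for s, r in symptomatic.items()}
estimates = ratio_correlation(shares_1970, share_ratios, national_total=10_000_000)
print(estimates)
```

Because the preliminary shares are rescaled before being applied, the State figures always add to the national control total, whatever the regression predicts.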

The Administrative Records method is a component method
which uses exemptions on individual Federal income tax returns
to measure civilian migration and reported birth and death
statistics to estimate natural increase. The tax returns are
matched by Social Security number in the base year and the
estimate year to determine the number of persons whose
county of residence changed during the period. A net migration
rate based on exemptions claimed by the matched cases is then
applied to the total population. This estimate is made specific
to the non-group quarters population under 65 by excluding
from the migration computations data relating to persons 65 years
and over. These estimates are then combined with independent
estimates of the population 65 and over based on Medicare
statistics. The other components of civilian population change
(births, deaths, immigration, and the net movement between
the Armed Forces and civilian population) are identical with
those of Component Method II.
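
A minimal Python sketch of the migration step follows. The record layout, the choice of base-year exemptions as the denominator of the rate, and all figures are assumptions made for illustration only; the text above does not specify these details.

```python
from dataclasses import dataclass


@dataclass
class MatchedReturn:
    """One tax return matched between the base year and the estimate year."""
    county_base: str    # county of residence in the base year
    county_now: str     # county of residence in the estimate year
    exemptions: int     # exemptions claimed on the return


def net_migration_rate(returns, county):
    """Net in-migration of exemptions per base-year exemption (assumed form)."""
    inflow = sum(r.exemptions for r in returns
                 if r.county_now == county and r.county_base != county)
    outflow = sum(r.exemptions for r in returns
                  if r.county_base == county and r.county_now != county)
    base = sum(r.exemptions for r in returns if r.county_base == county)
    return (inflow - outflow) / base


def administrative_records_estimate(base_population, rate, births, deaths):
    """Apply the migration rate to the base population, then add natural increase."""
    return base_population + rate * base_population + births - deaths


# Hypothetical matched returns for two counties, A and B.
returns = [
    MatchedReturn("A", "A", 4),
    MatchedReturn("A", "B", 2),
    MatchedReturn("B", "A", 3),
    MatchedReturn("A", "A", 4),
    MatchedReturn("B", "B", 5),
]
rate_a = net_migration_rate(returns, "A")
estimate_a = administrative_records_estimate(100_000, rate_a, births=1_500, deaths=900)
print(rate_a, estimate_a)
```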

Provisional estimates for States are prepared utilizing only
Component Method II (with partial data) and a two-variable
form of the Ratio-Correlation method.

Each  summer,  county  population  estimates  are  developed 
relying  upon  the  same  basic  procedures  as  outlined  for  the 
States,  with  minor  accommodations  and  additions  dictated 
by  data  availability.  For  instance,  the  estimating  equations 
and  variables  involved  are  specific  to  each  State  corresponding 
to  data  availability  and  the  characteristics  unique  to  individual 
areas.  Similarly,  special  censuses  are  much  more  frequent  for 
counties  and  local  areas  than  for  States,  and  are  frequently 
relied  upon  for  a  more  current  benchmark  upon  which  to 
construct  an  estimate.  Provisional  estimates  in  the  case  of 
counties  are  constructed  using  only  Method  II  as  a  measure 
of  change. 

Finally,  each  fall,  updated  estimates  are  prepared  for  all 
cities,  towns,  and  minor  civil  divisions  in  the  United  States 
using  only  the  Administrative  Records  method.  Again,  some 
modifications  of  detail  are  necessary  to  adapt  the  method  to 
provide  local  estimates  (e.g.,  the  lack  of  available  medicare 
statistics  below  the  county  level),  but  the  general  operations 
and  steps  of  the  model  remain  unchanged  from  the  system 
utilized  at  higher  levels  of  geography.  Again,  revisions  of 
figures  for  earlier  years  are  common  and  are  based  upon  more 
current  background  data,  including  some  modifications  in 
procedures. 


Internal  Relationships 

At  any  one  point  in  time,  what  is  the  inventory  of  available 
total  population  estimates?  Perhaps  the  most  meaningful  point 
of  reference  is  in  the  fall,  when  estimates  for  both  State  and 
local communities are prepared. In the fall of 1977, the
relationships shown in table 1 would apply.

Table 1. Timing and Status of Census Bureau 1976
Population Estimates Available Fall 1977

Area                        Status of estimates    Months until revised

States                      Revised
Counties                    Interim                6
Cities, towns, and MCD's    Provisional            12

The "interim" designation for counties is a complication
introduced by the particular timing of the estimates. In the
interests of promoting both consistency and improved
accuracy, the estimates are adjusted to conform to figures for
the next higher level of geography. In the fall, sufficient data
may be obtained to prepare estimates by Method II and the
Administrative Records method. Although this interim figure
may result in confusion on the part of census data users familiar
with the provisional and revised designations, it is felt that the
introduction of a second estimating technique provides a
significant additional measure of reliability over the provisional
numbers sufficient to warrant the public release of a further
series of figures.

A second feature of internal Census Bureau procedures may be
an additional source of confusion for users of our materials, and
is an element of our operations that needs attention. The issue is
the blending of survey results with independently derived
estimates and the potential for conflicting results from the two
approaches. Table 2 summarizes the types of control level
estimates developed by independent non-survey techniques to
serve as the basis for expanding sample results to full universe
levels. Although the list is extensive, there is frequent
opportunity for "uncontrolled" survey results to disagree with other
independently produced estimates. For example, more detailed
estimates of population by age and race are available by State
than are utilized in the survey controls process. Most often, the
survey results and independent estimates agree within sampling
variance and do not differ significantly as to magnitude or
pattern. Nonetheless, variations in figures released by the same
agency are often misunderstood and are not tolerated by the
average user of census materials. The degree to which both the
estimates and the survey findings may be pushed, the means of
establishing the line between the detail of the survey, and the
benefits of tying to independent estimates series for broader
characteristics have yet to be established. However, the points
of contact should be expected to shift over time as
developments occur in both types of data resources, and must be
updated on a continuing basis given both the limitations of
survey procedures and the accuracy levels demanded of the
controls.

Table 2. Sub-National Survey Controls

Survey                           Level                          Frequency  Years         Years         Date due
                                                                           provided      pending

Current population survey        Regions, States, 171 SMSA's,   Annual     1970-1976     1977          December 10
                                 6 city, county equivalents
CPS-voting age supplement        Selected States and SMSA's     Biennial   1970, 72, 74  1976          August
CPS-Spanish supplement           Selected States                Annual     1975, 1976    1977          November
Comprehensive Employment and     States                         Annual     1973-1974     1977          October
  Training Act (testing)                                        Monthly    None          1977
Survey of income and education   States                         Once       1976          None
National crime survey            10 largest States              Annual     1973-1976     1977          June 1978
Annual housing survey            20 SMSA's, central cities,     Annual     1974, 1975    1976          Summer 1977
                                 and balance of SMSA
Department of transportation     20 SMSA's, central cities,     Annual     1975-1976     Revised 1976  February 1978
  supplement                     and balance of SMSA
Voting rights act                Places and counties            Biennial   1972, 1976    1978          March 1978

External  Overlap 

In recent years, little overlap has occurred with other
agencies in the production and application of population estimates.
Both the letter and spirit of the Office of Management and
Budget Circular A-46 "Standard Data of Total Population
Used in Distributing Federal Benefits" (revised) governing
the use of total population estimates in the distribution of
Federal funds appear to have been followed. Of course, the
circular does not cover population characteristics, thereby
permitting the administering agency to revert to the 1970
census or work whatever accommodations on the most
recently detailed information as are reasonable and acceptable.

In most cases where a nonmatch occurs between the
provisions of the circular and the population figures used for
allocations, the noncompliance is merely a function of poor
timing or a lack of awareness that current figures have been
released. For example, recent Economic Development
Administration allocations for Local Public Works grants were
based upon 1973 population estimates for cities and towns.
Figures updated to 1975 were completed only 6 months prior
to the distributions, but the timing was such
that use of the more current figures was prohibited. Similarly,
the Legal Services Corporation, a quasi-Federal agency
responsible for legal defense grants-in-aid to local areas and
U.S. territorial possessions, has continued to appropriate funds
to the Trust Territories of the Pacific on the basis of 1970
population counts because it was not aware that the Census
Bureau might recognize a locally conducted 1973 census and
in fact is using the 1973 results as a foundation for 1975
estimates. In contrast, most agencies subscribe closely to the use
of the most current estimates available and make it common
practice to rely upon them in program planning applications
and auxiliary data series development (e.g., as a denominator
in the Bureau of Economic Analysis estimates of county per
capita income), as well as in fund allocations.

While overlap in the production of total population estimates
with other Federal agencies is not an issue, widespread overlap
does take place with non-Federal agencies. There is scattered
duplication at the State and county levels even within our own
Federal-State Cooperative Program for local population
estimates (FSCP). Again this is due largely to problems of data
lags and timing. One of the primary objectives of the FSCP is
to bring together the States and the Census Bureau on
population estimates. In some of our estimating work, we obtain data
directly from the primary collecting agency and the population
figures are developed independently by us. For most purposes,
an intermediary State agency in the FSCP is involved that
assembles the necessary data, examines the material for quality
and consistency, reviews the resulting estimates, and in some
cases prepares county and local area estimates itself to be
adopted by us. Frequently the locally derived estimates are
prepared to satisfy State level legislation, often requiring
completed figures well in advance of ours.

In other States and local areas, estimates are developed
that have no ties with the FSCP. The methods used
to construct the estimates have not been subjected to a
comprehensive testing program, and the dialogue with the Census
Bureau estimates program reflects healthy professional
disagreement until such evaluations can be performed. This
includes estimates usually developed by county and city planning
agencies using the housing unit method, in which insufficient
attention is devoted to the maintenance of an accurate housing
stock, the changing nature of housing in the inventory, current
vacancy levels, and the secular reduction in population per
household. The professional impasse and the accompanying
fiscal pressures on local officials in light of the impact of
population figures on city fund allocations have recently resulted
in resolutions by the U.S. Conference of Mayors for the use of
". . . valid measures (and) methods . . ." in preparing current
estimates.² The disagreements with local areas on levels and the
direction of population change have been partially responsible
for Federal legislation introduced recently as the "Census
Reform Act of 1977" calling for the arbitration of disputes over
estimates by the U.S. Congress.

This is not to say that the Census Bureau estimates for local
areas are trouble-free by any means. Table 3 presents the
results of evaluations relating Census Bureau estimates for local
areas to approximately 2,000 special censuses conducted
around 1975. While the estimates for larger areas (10,000 and
over) are well within the acceptable range, the results for
areas of 5,000 to 10,000 population are questionable, and the
findings for smaller areas indicate substantial room for
improvement. Alternative estimating techniques and review
procedures for these types of areas are under consideration, and
serious attempts to improve the estimates by local agencies or
research groups are welcomed.
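
For readers who wish to run a comparable evaluation on their own figures, the measures reported in table 3 (average percent difference disregarding sign, a distribution by size of difference, and counts of estimates above and below the census) can be computed along the following lines. This Python sketch takes the special census as the base, as the table does; the treatment of class boundaries and the sample figures are assumptions made for illustration.

```python
from collections import Counter

# Upper bounds (percent) for the size-of-difference classes used in table 3.
# Assumption: each class includes its lower bound and excludes its upper bound.
SIZE_CLASSES = [
    (1.0, "less than 1 percent"),
    (3.0, "1 percent to 3 percent"),
    (5.0, "3 percent to 5 percent"),
    (10.0, "5 percent to 10 percent"),
]


def percent_difference(estimate, census):
    """Signed percent difference, with the special census as the base."""
    return 100.0 * (estimate - census) / census


def size_class(abs_diff):
    for upper, label in SIZE_CLASSES:
        if abs_diff < upper:
            return label
    return "10 percent or more"


def summarize(pairs):
    """pairs: iterable of (estimate, special census count) for each place."""
    diffs = [percent_difference(e, c) for e, c in pairs]
    return {
        "number_of_places": len(diffs),
        "average_abs_percent_difference": sum(abs(d) for d in diffs) / len(diffs),
        "size_of_difference": Counter(size_class(abs(d)) for d in diffs),
        "higher": sum(1 for d in diffs if d > 0),
        "lower": sum(1 for d in diffs if d < 0),
    }


# Hypothetical (estimate, census) pairs for four places.
summary = summarize([(1010, 1000), (480, 500), (2000, 2000), (1150, 1000)])
print(summary)
```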

Projections 

Many  of  the  conditions  present  during  the  late  1960's  and 
the  early  1970's  that  brought  about  such  firm  legislative  and 
administrative  action  on  population  estimates  are  also  present 
now  with  respect  to  population  projections.  For  example, 
the  proliferation  of  projections  in  recent  years  has  encouraged 
a  concern  for  standardization  of  methods,  data,  philosophy 
of  projections,  and  their  underlying  assumptions.  A  recent 
newspaper  article  on  projections  for  Fairfax  County,  Virginia 
is a case in point. No fewer than eight different sets of projections
were  identified  by  the  reporter  as  having  been  produced  in  the 
last  4  years  covering  a  wide  range  of  figures  completed  by  a 
variety  of  governmental  agencies  and  consulting  firms. 

The uses of such projections have been altered from a casual
exercise in background information to planning applications,
and more recently to a central position in the qualification for
funding by Federal projects. To date, no Federal legislation
requires the use of projections for qualification or for use in
distribution equations. The Environmental Protection Agency
and the Department of Transportation do fund projects and
distribute funds on the basis of projections, but this is a result
of program administration rather than specific legislative
provisions. Other currently active legislation provides for or
encourages the development of population projections as a part
of the programs, but does not base funding on them directly.
For example, much of the Housing and Urban Development
planning activity supported by 701 planning grants included
work on population and related projections. The Housing and
Community Development Act of 1974 (PL 93-383) continues
that emphasis and similarly stipulates that population
projections should be a core item in the comprehensive planning
process supported by funding under the Act.

An even stronger provision is contained in the recent State and
area health planning and resources development legislation of
1974 (PL 93-641). Health planning data at the county level and
below are required, including population projections. The
requirements are sufficiently demanding and the available data
resources are so scanty, however, that the requirements have
been waived for a 2-year period. Similarly, the Employment
and Training Administration is promoting an extensive
manpower planning program for all Comprehensive Employment
and Training Act (CETA) prime sponsors that relies upon a
massive foundation of background information, including
population projections. Recognizing the lack of such material,
ETA is constructing its own data generation and projection
capability through the Lawrence Berkeley Laboratories and
will provide the material to the local CETA agencies.

In a recent Census Bureau survey of groups preparing local
population projections for use in satisfying Federal legislation,
approximately 45 city, county, and regional agencies were
identified, each one developing figures to satisfy an average of
two pieces of Federal legislation.³ Of course, there is some
indication that the agencies cite such legislation as support
and justification for their research activities when compliance
is not totally required and other projections may be available
for the same area. Nonetheless, Federal legislation appears to
reflect considerable interest in local projections with a
commensurate measure of local activity taking place in response.

However, administrative practice has imposed some
limitations on which projections may be used. For example, the
Environmental Protection Agency (EPA) stipulates that county
and regional population projections from the Bureau of
Economic Analysis (BEA) must be relied upon in application
for 208 planning grants for water quality facilities management.
Similarly, administration of section 201 provisions includes
the use of BEA figures as controls for projections below the
county level, with the actual development of subcounty
projections often being carried out by private consulting groups.
Variations from the BEA projections are tolerated only with
sufficiently extensive documentation supporting alternative
figures. The Department of Housing and Urban Development
also suggests the consideration of BEA as at least one of the
series of projections in local planning studies.

Other  Federally-developed  projections  that  are,  or  will  be, 
available  for  similar  applications  are  Census  Bureau  projections 
for  States,  and  State  and  county  projections  by  the  Department 


'Adopted  Resolutions  of  the  Conference  of  the  Conference  of 
Mayors,  45th  Annual  Meeting,  )une  15,  1977,  New  Resolution  No.  115, 
U.S.  Conference  of  Mayors,  Washington,  D.C. 


'Unpublished  U.S.  Bureau  of  the  Census  tabulations  of  1975  survey 
of  agencies  and  groups  preparing  population  estimates  and  projections. 


ESTIMATES,  SURVEYS,  AND  FORECASTS 


Table 3. Comparison of Administrative Records Methods Estimates With Special Censuses Conducted After July 1, 1974,
for Subcounty Areas by Size of Area and Region: July 1, 1975

(Base is the special census results adjusted to July 1, 1975)

                                          |------------------- Size of place (1) -------------------|
Item                                   All     Under    500 to  1,000 to  5,000 to 10,000 to    50,000
                                    places       500       999     4,999     9,999    49,999   or more

United States

Number of places                     2,051       454       232       636       251       378       100
Average percent difference (2)        12.5      18.6      10.0       9.3       7.0       4.1       3.2

Size of difference:
  Less than 1 percent                  220        16        17        60        33        75        19
  1 percent to 3 percent               410        31        34       116        56       126        47
  3 percent to 5 percent               282        28        29        88        44        77        16
  5 percent to 10 percent              465        78        62       178        58        75        14
  10 percent or more                   674       301        90       194        60        25         4

Where the administrative records
method estimate was:
  Higher                               908       186       100       267       121       184        50
  Lower                              1,143       268       132       369       130       194        50

(1) Based on 1970 census.
(2) Disregarding sign.

of  Energy.  The  Census  projections  are  an  update  of  the  1972 
work,^ with some extensions beyond the illustrative continuation
of past trends. The Department of Energy project is just
beginning  toward  a  goal  of  modeling  energy  requirements  for 
States.  However,  the  unit  of  analysis  will  be  counties.  The 
projection  framework  will  allow  a  number  of  alternative 
scenarios  and  will  permit  some  analysis  of  local  economic, 
demographic,  and  related  impacts. 

Despite  such  widespread  Federal  activity  in  population 
projections,  there  is  only  minimal  interaction  between  the  BEA, 
the Department of Energy, ETA, and Census Bureau studies, and
only  superficial  involvement  of  local  analysts.  The  Department 
of  Energy  project  includes  a  review  and  planning  committee 
comprised  of  representatives  from  other  Federal  agencies.  As 
the  work  evolves,  some  interaction  of  professional  staff  may 
also  occur,  but  there  is  no  guarantee  that  this  will  result  in 
agreement  of  the  projections  with  figures  developed  elsewhere 
in  Federal  government.  Such  professional  contact  has  occurred 
and  is  continuing  at  an  informal  working  level  between  BEA  and 
the  Census  Bureau,  but  again  with  no  formal  assurance  of  direct 
correspondence  between  the  results  of  the  two  projections. 
Although  there  is  occasional  interaction  between  the  ETA  and 
Census  Bureau  staffs,  the  ETA  projections  remain  a  product  of 
that  agency. 

Two  Census  Bureau  projections-related  activities  not  covered 
by the 1972 work and the update under way now are: (1) A
guide for reviewing and preparing local area population projections,^
and (2) a joint American Statistical Association
(ASA)-Census  Bureau  workshop  on  projections.  The  guide  is 
a  handbook  designed  for  local  planners  and  others  unfamiliar 
with  projection  techniques  to  serve  as  an  aid  in  reviewing  the 
adequacy  of  available  projections  and  as  instructional  material 
for  preparing  projections  where  none  exist.  A  component 
procedure  is  followed  step-by-step  through  the  development 
of  results  as  illustrative  of  one  technique  described  in  the  guide. 
The  document  was  prepared  under  agreement  with  the  Health 
Resources  Administration  as  a  part  of  their  program  to  assist 
local  health  planners  in  meeting  the  data  requirements  of  the 
1974  health  planning  legislation. 

The  joint  workshop  on  migration  modeling  and  projections 
was  a  small  invitational  conference  supported  by  the  National 
Science  Foundation  as  one  of  two  conferences  being  conducted 
in  1977  under  a  3-year  pilot  program  to  assist  in  examining 
"real-world"  statistical  problems.  It  served  as  a  meeting  of  the 
research  community  at  large  with  the  Census  Bureau  as  one 
element  in  a  continuing  research  effort  to  improve  the  social 
science  data  base.  Other  Federal  agencies  and  population 
scholars  were  involved  in  the  workshop  with  the  objectives  of 
(1)  reviewing  and  summarizing  work  to  date  in  this  research 
area,  (2)  evaluating  Census  Bureau  programs  relative  to  the 
current  status  of  the  projections  field,  (3)  determining  whether 
the  development  and  application  of  projection  methods  is  a 


*U.S. Bureau of the Census, Current Population Reports, Series
P-25,  Number  477  (Washington,  D.C.,  U.S.  Government  Printing  Office, 
March  1972). 


'Richard  Irwin,  Guide  for  Local  Area  Population  Projections,  U.S. 
Bureau  of  the  Census,  Technical  Paper  39  (Washington,  D.C.,  U.S. 
Government  Printing  Office,  1977). 
high priority project for continued ASA involvement, and
(4) identifying specific research activities that might lead to
further  joint  involvement  of  the  two  groups.  A  side  benefit, 
of  course,  is  a  current  reading  on  projection  activities  of 
Federal  agencies  as  a  first  step  toward  coordination. 

Summary 

Estimating methods in use at the Census Bureau are reviewed
as background for identifying provisional, interim, and
final series of estimates. Applications of the figures as internal
controls for census survey operations are summarized with
points of overlap highlighted. Little conflict among population
estimates in use by other Federal agencies in allocation, planning,
or research applications is found, but considerable local
and State activity is identified. Substantially more opportunity
for conflicting results and a corresponding lack of coordination
is found in both Federal and local projection work. There is
potential for strong legislative guidelines similar to those now in
effect for estimates, but realizing it will take some measure of
insight and planning, with possible executive intervention
similar to that specified for current estimates in Circular A-46.
Exhibit I

Circular No. A-46

Revised 


STANDARD  DATA  SOURCE  OF  TOTAL  POPULATION 
USED  IN  DISTRIBUTING  FEDERAL  BENEFITS 


Purpose: The purpose of Circular A-46, Exhibit I is to
assure use of standard data on total population for all
Federal programs which make use of total population data in
the distribution of Federal benefits.

Current  data :  For  the  purposes  of  this  Circular  the  term 
current  data  means  the  most  current,  complete  national 
series  as  published  by  the  Bureau  of  the  Census  in  "Current 
Population  Reports,"  P-25,  P-26,  or  related  series,  except 
where  data  from  a  decennial  census  conducted  by  the  Bureau 
of  the  Census  is  more  current. 

Use  of  Data  of  Total  Population:  Executive  departments  and 
establishments in distributing and/or determining
eligibility  for  the  benefits  of  an  appropriation  for  a 
single  year  on  the  basis  of  data  on  total  population  shall 
use  data  which  refer  to  the  same  point  or  period  of  time  for 
each  class  of  eligible  government.  The  data  on  total 
population  shall  be  the  most  current  and  comprehensive 
published  by  the  Bureau  of  the  Census.  Where  total 
population  is  used  as  the  denominator  of  a  fraction,  the 
data  for  both  numerator  and  denominator  will  be  the  most 
recent  for  which  both  are  available.  Where  total  population 
is  used  as  the  numerator  of  a  fraction,  the  data  for  both 
numerator and denominator will be the most recent for which
both  are  available. 

Justification  for  Exception:  Agencies  shall  request 
approval  by  the  Office  of  Management  and  Budget  (OMB)  for 
use  of  data  on  total  population  other  than  the  most  current 
data  published  by  the  Bureau  of  the  Census.  The  request 
will include identification of the program(s) affected,
legislation implemented by those programs, justification for
use  of  alternative  data,  and  a  report  on  consultations  with 
the  Bureau  of  the  Census  in  respect  to  data  sources. 

Any agency required by legislation to use data on total
population other than those required by this Circular shall
notify the Office of Management and Budget prior to such
use.
CIRCULAR NO. A-46
Transmittal Memorandum No.


Appendix   A   (continued) 


TO THE HEADS OF EXECUTIVE DEPARTMENTS AND ESTABLISHMENTS

SUBJECT: Amendment to Circular No. A-46 "Standard Data of
Total Population Used in Distributing Federal
Benefits"


Attached is Exhibit I which amends Circular No. A-46. The
purpose  of  this  amendment  is  to  assure  use  of  standard  data 
on  total  population  for  all  Federal  programs  which  make  use 
of  total  population  data  in  the  distribution  of  Federal 
benefits. 


JAMES T. LYNN
DIRECTOR 


Attachment 


Proposed    Legislation: 


legislation 
responsible  f 
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les   at  the  time  the  proposed 

OMB  for  review  pursuant  to 
its  review  process  under  that 
st  additional  information  and 

population  data   and   their 


Report  on  use  of  Data  on  Total  Population:  Executive 
departments  and  establishments  shall  report  by  August  29, 
1975  on  all  programs  the  benefits  of  which  are  distributed 
on  the  basis  of  total  population  or  formulas  including  total 
population  as  an  element.  The  report  shall  separately 
identify  for  each  program: 

a.  Name  of  program  and   Catalog   of   Federal   Domestic 
Assistance  numeric  identifier. 

b.  Statutory  basis. 

c.  Data   elements   specified   in   addition    to   total 
population. 

d.  Source  (s)  of  data  now  used  for  each  data  element. 

e.  Amount of funds distributed fiscal year 1974 and
fiscal year 1975.

f.  Date  of  next  distribution. 
ESTIMATES, SURVEYS, AND FORECASTS


UPDATING  PER  CAPITA  INCOME  FOR 
GENERAL  REVENUE  SHARING 

Roger  Herriot 
Bureau  of  the  Census 

The  inclusion  of  per  capita  income  in  the  general  revenue 
sharing  legislation  brought  forth  a  new  era  in  income  statistics. 
Prior to this time the Census Bureau had never produced any
income statistics for governmental units whose population was
below 2,500, except for counties. In accordance with the
legislation,  the  Bureau  produced  1969  per  capita  income 
estimates  for  all  governmental  units  regardless  of  size,  using  data 
from  the  1970  census  20-percent  sample.  These  estimates  have 
been  updated  twice  since  then,  for  1972  and  1974.  This  paper 
describes  the  procedures  used  to  develop  the  updated  estimates. 

The 1974 and revised 1972 per capita income (PCI) figures are
the estimated average amounts of money income per person
received during calendar years 1974 and 1972 by all persons
residing in a given political jurisdiction in April 1975 and April
1973, respectively. The 1974 and revised 1972 PCI estimates are
based  on  the  1970  census  and  have  been  updated  using  rates  of 
change  developed  from  various  administrative  record  sets  and 
compilations,  mainly  from  the  Internal  Revenue  Service  (IRS) 
and  the  Bureau  of  Economic  Analysis  (BEA). 

The  PCI  estimates  are  based  on  a  money  income  concept. 
Total  money  income  is  defined  by  the  Bureau  of  the  Census  for 
statistical  purposes  as  the  sum  of: 

1.  Wage and salary income

2.  Net  nonfarm  self-employment  income 

3.  Net  farm  self-employment  income 

4.  Social  Security  and  railroad  retirement  income 

5.  Public  assistance  income 

6.  All  other  sources  of  money  income  such  as  interest, 
dividends,  veteran's  payments,  pensions,  unemployment 
insurance,  alimony,  etc. 

The  total  represents  the  amount  of  income  received  before 
deductions for personal income taxes, Social Security, bond
purchases,  union  dues,  medicare  deductions,  etc. 

STATE  AND  COUNTY  METHODOLOGY 

The  updated  per  capita  income  estimates  are  based  on  the 
per  capita  income  figure  for  1969  from  the  1970  census.  The 
State  and  county  estimates  are  updated  by  type  of  income  (the 
categories  classified  in  the  1970  census)  reflecting  changes 
between  1969  and  1974  from  administrative  records  sources, 
i.e., Federal tax returns and BEA's personal income data.

IRS data were tabulated by the Bureau of the Census using an
extract of the 1969, 1972, and 1974 IRS individual master file.
The  BEA  data  are  developed  from  the  National  income  and 
product  account  system.' 

'  A  detailed  explanation  of  the  derivation  of  the  BEA  money  income 
estimates  is  available  from  the  Bureau  of  Economic  Analysis,  Regional 
Economic  Measurement  division,  U.S.  Department  of  Commerce. 


Wages and salaries at both the State and county level are
updated using IRS data. The 1970 census aggregate wage and
salary amount for States is increased by the percent change in
wages and salaries from 1969 to 1974 as computed from tax
returns, as shown in formula 1.

At  the  county  level  there  is  a  greater  possibility  of  miscoding 
the  tax  returns.  To  minimize  the  effect  of  this  source  of  possible 
bias  the  county  wage  and  salary  updates  are  done  on  a  per 
capita  basis.  The  1970  census  amount  of  wage  and  salary  per 
capita  is  increased  by  the  percent  change  in  IRS  wages  and 
salaries per exemption from 1969 to 1974. This updated 1974
per  capita  figure  is  then  multiplied  by  the  1975  population 
estimate  to  derive  aggregate  wages  and  salaries  as  shown  in 
formula  2. 
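
The county wage and salary update described above can be sketched in a few lines of Python. This is an illustration only: the function name, argument names, and all dollar and population figures are hypothetical and are not taken from any Census Bureau or IRS file.

```python
def update_county_wages(cen_ws_1969, sample_pop_1970,
                        irs_ws_1969, irs_exemp_1969,
                        irs_ws_1974, irs_exemp_1974,
                        pop_1975):
    """Estimate 1974 aggregate wages and salaries for one county."""
    # Census wages and salaries per capita for 1969 (sample population base).
    cen_per_capita = cen_ws_1969 / sample_pop_1970
    # Ratio of IRS wages and salaries per exemption, 1974 relative to 1969.
    irs_ratio = (irs_ws_1974 / irs_exemp_1974) / (irs_ws_1969 / irs_exemp_1969)
    # Updated per capita figure multiplied by the 1975 population estimate.
    return cen_per_capita * irs_ratio * pop_1975


# Invented figures for a single hypothetical county.
example_1974 = update_county_wages(
    cen_ws_1969=30_000_000, sample_pop_1970=10_000,
    irs_ws_1969=24_000_000, irs_exemp_1969=8_000,
    irs_ws_1974=40_500_000, irs_exemp_1974=9_000,
    pop_1975=11_000)
```

Working per exemption rather than with aggregates means that a change in the number of returns coded to the county does not by itself move the estimate.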

The  remaining  types  of  income  identified  in  the  1970  census, 
i.e., nonfarm and farm self-employment, Social Security, public
assistance,  and  other  income,  are  updated  for  States  and 
counties  using  the  percent  change  in  BEA  estimates  for  these 
sources  as  shown  in  formula  3. 

It is important to note that the income adjustments made
to the BEA data account for different population bases.
The  BEA  income  data  are  based  on  a  midyear  (July)  population 
figure  for  each  respective  year.  Since  the  changes  have  to  apply 
to  a  census  income  figure  with  a  population  base  as  of  April  of 
the  following  year,  the  BEA  aggregates  are  adjusted  forward  to 
reflect  the  April  population  base  before  the  rate  of  change  is 
calculated  (see  formula  3).  It  should  also  be  noted  that  the  1969 
census  income  data  are  based  on  a  20-percent  sample  population 
base;  these  are  adjusted  to  reflect  the  full  population  count 
from  the  1970  census  since  this  is  the  population  figure  carried 
forward  in  updated  population  estimates. 
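
A minimal sketch of the BEA-based update, including both population-base adjustments just described, might look as follows. The function and argument names are hypothetical, and the code assumes a single income type for a single area.

```python
def update_bea_income(cen_inc_1969, sample_pop_1970, pop_1970,
                      bea_inc_1969, bea_pop_1969,
                      bea_inc_1974, bea_pop_1974, pop_1975):
    """Estimate the 1974 census-concept aggregate for one income type."""
    # Move each BEA aggregate from its July base to the following April base.
    bea_1974_april = bea_inc_1974 / bea_pop_1974 * pop_1975
    bea_1969_april = bea_inc_1969 / bea_pop_1969 * pop_1970
    rate_of_change = bea_1974_april / bea_1969_april
    # Re-base the sample-based census aggregate to the full 1970 census count.
    cen_full_count = cen_inc_1969 / sample_pop_1970 * pop_1970
    return cen_full_count * rate_of_change
```

If the BEA figures and every population are unchanged between the two years, the function returns the census amount unchanged, which is a convenient sanity check.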

Because  of  the  volatile  nature  of  changes  in  county  farm 
income,  relative  to  other  income  gains  and  losses,  a  "constrained 
net"  farm  income  estimate  is  being  utilized  (see  formula  4).  The 
first  step  in  this  procedure  is  the  preparation  of  a  "net"  farm 
income  estimate,  which  between  1969  and  1974  is  developed  by 
adding  the  dollar  change  in  BEA  farm  self-employment  income 
plus  land  rent,  to  the  1969  farm  income  figure  from  the  1970 
census.  The  second  step  is  the  preparation  of  a  "gross  change" 
farm  income  estimate,  which  is  developed  by  applying  the 
percent  change  in  BEA  farm  receipts  plus  the  dollar  change  in 
land  rent  to  the  1970  census  figure.  This  gross  change  estimate 
is  then  used  to  constrain  the  movement  of  net  income.  If  the 
"net"  estimate  falls  between  80-  and  120-percent  of  the  "gross 
change"  estimate,  the  net  estimate  is  used.  If  it  falls  below 
80-percent  of  the  gross  estimate,  the  "constrained  net"  estimate 
is  80-percent  of  the  "gross  change"  estimate.  If  it  is  higher  than 
120-percent  of  the  "gross  change"  estimate,  the  "constrained 
net"  is  120-percent  of  the  "gross  change"  estimate. 
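
The constraint amounts to clamping the net estimate to a band around the gross change estimate. The sketch below is illustrative only; the function name is hypothetical, and it assumes the gross change estimate is positive, a case the text does not address otherwise.

```python
def constrained_net(net_est, gross_change_est):
    """Clamp the net farm income estimate to 80-120 percent of gross change."""
    # Assumption: gross_change_est > 0, so the lower bound is below the upper.
    lower = 0.8 * gross_change_est
    upper = 1.2 * gross_change_est
    return min(max(net_est, lower), upper)
```

For example, with a gross change estimate of 100, a net estimate of 50 is raised to 80, a net estimate of 150 is lowered to 120, and a net estimate of 95 is used as is.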

Total money income for 1974 is the sum of the income
types, and the per capita income estimate is the quotient of
total money income divided by the April 1975 census
population estimate.

For consistency in State and county totals, the county
income estimates are controlled to State totals before per capita
income is calculated.
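
The paper does not say how the county estimates are controlled to the State total. One common choice, assumed here purely for illustration, is a pro rata adjustment in which every county is scaled by the same factor. The function name and data layout are hypothetical.

```python
def control_to_state(county_income, state_total):
    """Scale county aggregates pro rata so that they sum to the State total."""
    # Assumption: a single proportional factor is applied to every county.
    factor = state_total / sum(county_income.values())
    return {county: amount * factor for county, amount in county_income.items()}
```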

The revised estimates for 1972 are derived in the same
manner as the 1974 estimates.




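Formula 1. (for States)

\[
\text{1974 CEN W\&S} =
\left[\frac{\text{1974 IRS W\&S}}{\text{1969 IRS W\&S}}\right]
\times
\left[\frac{\text{1969 CEN W\&S}}{\text{1970 Sample Pop.}} \times \text{1970 Pop.}\right]
\]

Formula 2. (for counties)

\[
\text{1974 CEN W\&S} =
\left[\frac{\text{1974 IRS (W\&S/EXEMP)}}{\text{1969 IRS (W\&S/EXEMP)}}\right]
\times
\left[\frac{\text{1969 CEN W\&S}}{\text{1970 Sample Pop.}}\right]
\times
\left[\text{1975 Pop.}\right]
\]

Formula 3. (for States and counties)

\[
\text{1974 CEN INC}(I) =
\left[\frac{\left(\dfrac{\text{1974 BEA INC}(I)}{\text{1974 Pop.}}\right) \times \text{1975 Pop.}}
           {\left(\dfrac{\text{1969 BEA INC}(I)}{\text{1969 Pop.}}\right) \times \text{1970 Pop.}}\right]
\times
\left[\frac{\text{1969 CEN INC}(I)}{\text{1970 Sample Pop.}} \times \text{1970 Pop.}\right]
\]

Where:

INC (I)  =  Nonfarm self-employment
            Farm self-employment (see formula 4 for adjustment
              for county estimates)
            Social Security
            Public Assistance
            "Other Income"

and where:

W & S    =  wages and salaries
EXEMP    =  number of personal exemptions on IRS tax returns
Pop      =  estimated number of persons

Formula 4. (for county farm self-employment income only)

\[
\text{1974 CEN FSE (net est.)} =
\left[\text{1974 BEA FSE} + \text{1974 BEA LR} - \text{1969 BEA FSE} - \text{1969 BEA LR}\right]
+
\left[\frac{\text{1969 CEN FSE}}{\text{1970 Sample Pop.}} \times \text{1970 Pop.}\right]
\qquad \text{(A)}
\]

\[
\text{1974 CEN FSE (gross change est.)} =
\left[\frac{\left(\dfrac{\text{1974 BEA FR}}{\text{1974 Pop.}}\right) \times \text{1975 Pop.}}
           {\left(\dfrac{\text{1969 BEA FR}}{\text{1969 Pop.}}\right) \times \text{1970 Pop.}}\right]
\times
\left[\frac{\text{1969 CEN FSE}}{\text{1970 Sample Pop.}} \times \text{1970 Pop.}\right]
+
\left[\text{1974 BEA LR} - \text{1969 BEA LR}\right]
\qquad \text{(B)}
\]

\[
\text{1974 CEN FSE (constrained net est.)} =
\begin{cases}
A     & \text{if } 0.8B \le A \le 1.2B \\
0.8B  & \text{if } A < 0.8B \\
1.2B  & \text{if } A > 1.2B
\end{cases}
\qquad \text{(C)}
\]

Where:

FR   =  BEA farm receipts
FSE  =  Farm self-employment income
LR   =  Land rent
A    =  1974 CEN FSE (net est.)
B    =  1974 CEN FSE (gross change est.)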

SUBCOUNTY  METHODOLOGY 

The  1974  and  revised  1972  per  capita  income  estimates  for 
subcounty  governmental  units  are  derived  in  somewhat  the  same 
manner  as  those  for  counties.  However,  there  are  differences  in 
the income components used in the estimation procedure, and
in  the  sources  used  to  update  the  components.  The  basic 
procedure  is  the  application  of  the  rate  of  change  in  IRS 
adjusted  gross  income  per  exemption  and  BEA  county  transfer 
income  per  capita  to  estimates  of  these  components  developed 
from  the  1970  census.  The  1972  estimates  for  each  component 
are  prepared  using  the  rate  of  change  from  1969  to  1972.  The 
1974  estimates  are  based  on  the  1972  estimates,  and  are 
updated by an estimate of change from 1972 to 1974. The 1972
per  capita  income  (PCI)  estimates  represent  revisions  to  the 
previously  published  1972  PCI  estimates. 

The  basic  update  procedure  is  straightforward.  However,  due 
to  the  diversity  of  the  geographic  areas  for  which  estimates  are 
being  made,  and  the  data  problems  that  affect  the  quality  and 
reliability  of  the  data  used  in  developing  the  estimates,  selected 
adjustments  and  constraints  were  built  into  the  subcounty 
model.  The  presence  of  these  constraints  and  adjustments  in  the 
model may obscure the basic procedure. For our purposes in
presenting the methodology here, we have divided the procedure
into two parts: (1) The development of a 1969 base per
capita income figure, and (2) the estimation and application of
the rate of change in per capita income.

Development  of  a  1969  PCI  Base  Figure 

In  preparing  the  1970  census  per  capita  income  figures  for 
use  in  the  estimation  process,  four  operations  were  developed: 

(a)  Adjustment of Census Data for Annexation and
Boundary Changes -- To avoid a piecemeal adjustment of
census  data  to  reflect  annexation  and  boundary  changes,  a 
formula  has  been  built  into  the  income  estimation  model  to 
adjust  the  data  for  these  areas  before  the  update  process 
begins.  The  procedure  is  to  estimate  the  census  data  for  the 
annexed  portion  of  the  area,  add  this  amount  to  the  area 
annexing,  and  subtract  the  amount  from  the  area  being 
reduced. 

The  estimate  for  an  annexed  area  is  based  on  a  weighted 
average  for  all  places  in  a  county  which  lost  population  due 
to  annexation  or  boundary  changes.  Formula  5  shows  the 
adjustment procedure.

Formula 5. Adjustment to Census and IRS Data for Annexation
and Boundary Changes

\[
\text{FINAL ITEM}_i = \text{ORIG ITEM}_i \pm \text{1970 ANNEX POP}_i \times
\frac{\sum_{j=1}^{J} \text{ORIG ITEM}_j}{\sum_{j=1}^{J} \text{1970 ORIG POP}_j}
\]

Where:

i     =  each subcounty unit in county
j     =  each subcounty unit reduced by annexation
ITEM  =  each 1970 census sample item, and each IRS data
         item for 1969, 1972, and 1974.

The annexed amount is added (+) for the annexing unit and
subtracted (-) for the unit being reduced.

Note: For Census data original geographic base is 1970. For
IRS data original geographic base is 1972.
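
The annexation adjustment can be sketched as follows. This is an illustration under stated assumptions: the weighted average is taken to be the ratio of summed item values to summed 1970 populations over the units that lost population, and the function name and the `transfers` layout (losing unit, gaining unit, annexed population) are hypothetical.

```python
def annexation_adjust(items, pops_1970, transfers):
    """Move an estimated item amount from each reduced unit to the annexing unit.

    items      -- unit name -> original item value (e.g., aggregate income)
    pops_1970  -- unit name -> original 1970 population
    transfers  -- list of (losing_unit, gaining_unit, annexed_population)
    """
    losers = {loser for loser, _, _ in transfers}
    # Assumed weighted average: item value per person over all losing units.
    per_capita = (sum(items[u] for u in losers)
                  / sum(pops_1970[u] for u in losers))
    final = dict(items)
    for loser, gainer, annexed_pop in transfers:
        amount = annexed_pop * per_capita
        final[loser] -= amount   # unit being reduced
        final[gainer] += amount  # unit annexing
    return final
```

Because every amount subtracted from one unit is added to another, the county total is unchanged by the adjustment.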

(b)  1969 PCI Estimates for Small Areas -- When estimates
of 1972 PCI were originally prepared for revenue sharing
purposes, it was decided that the census per capita income
figures for areas^ with fewer than 500 persons (weighted
sample  population)  were  of  insufficient  statistical  reliability 
for  use  in  the  estimation  process  due  to  the  large  degree  of 


'The basic estimate unit is the MCD/place piece in functioning MCD
counties and place/balance of county in nonfunctioning MCD counties.


sampling  variability  present  in  these  data.  Instead,  the  PCI 
value  for  the  county  or  Minor  Civil  Division  (MCD)  was  used 
for  these  areas.  For  the  present  round  of  estimates,  it  was 
determined  that  the  updated  estimates  for  these  places 
should  be  based  on  a  1969  PCI  estimate  which  would  make 
use  of  the  1970  census  sample  PCI  for  these  small  areas. 

The  final  1969  PCI  for  areas  having  a  weighted  sample 
population  estimate  of  less  than  1,000  is  a  weighted  average 
of  the  original  1970  census  sample  value  and  a  regression 
estimate.  The  regression  estimate  utilizes  the  sample  value  as 
the  dependent  variable  and  census  housing  value  (not  a 
sample  item),  county  income  and  housing  data,  and  IRS 
adjusted  gross  income  per  exemption  as  the  independent 
variables  in  the  regression  equation.  Separate  regressions  were 
run  by  State  by  population  size  class  (less  than  500,  and  500 
to  1,000). 

The  weights  applied  to  the  sample  PCI  and  the  regression 
estimate  reflect  a  measure  of  the  variance  in  the  sample 
estimates  relative  to  that  in  the  regression  estimates  as 
determined  by  the  fit  of  the  regression  estimates  to  the 
sample  PCI  figures.  No  weighted  estimate,  however,  is 
allowed  to  deviate  from  the  sample  PCI  figure  by  more  than 
one  standard  error. 

There  is  a  substantial  degree  of  quality  control  exercised 
in  the  development  and  use  of  the  weighted  estimates.  The 
census  and  IRS  data  for  the  regression  estimate  are  tested  for 
reliability  before  they  are  used.  If  one  or  more  of  the 
independent  variable  data  items  are  suspect,  those  variables 
are  dropped  from  the  regression  equation.  Each  estimate  is 
then  flagged  to  indicate  which  variables  were  used  in  the 
regression  estimate,  and  whether  or  not  the  weighted 
estimate  had  to  be  constrained  at  one  standard  error  from 
the  sample  PCI.  If  no  census  sample  data  were  available  for 
the  development  of  a  weighted  estimate,  the  county  or  MCD 
figure  is  plugged. 

For  a  more  detailed  discussion  of  this  procedure  see 
attachment  A. 
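
The weighted estimate and its one-standard-error constraint can be sketched as below. The paper says only that the weights reflect the variance of the sample estimates relative to that of the regression estimates; inverse-variance weighting is assumed here for illustration, and all names are hypothetical.

```python
def weighted_pci(sample_pci, regression_pci,
                 sample_var, regression_var, sample_se):
    """Combine sample and regression PCI, constrained to one standard error."""
    # Assumed inverse-variance weights: the noisier source gets less weight.
    w_sample = regression_var / (sample_var + regression_var)
    estimate = w_sample * sample_pci + (1.0 - w_sample) * regression_pci
    # The weighted estimate may not deviate from the sample PCI by more
    # than one standard error.
    lower = sample_pci - sample_se
    upper = sample_pci + sample_se
    return min(max(estimate, lower), upper)
```

When the sample variance is large relative to the regression variance, the estimate leans toward the regression value until the one-standard-error bound stops it.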

(c)  Estimation of 1969 Adjusted Gross Income and
Transfer Income from Census Money Income -- In an effort to
use as much subcounty data as possible for updating census
money income (specifically, all adjusted gross income from
IRS), census money income for 1969 is divided into two
parts: Adjusted gross income and transfer income. Transfer
income  from  the  census  is  estimated  to  be  the  sum  of  Social 
Security,  public  assistance,  and  a  portion  of  "other  income." 
The  determination  of  the  portion  of  Census'  "other  income" 
to  be  classified  as  transfer  income  is  based  on  a  detailed 
breakdown  of  the  components  of  "other  income"  for 
counties by BEA and estimates of the number of unemployed,
veterans, students, etc., for each governmental unit
for  the  1970  census.  Then,  adjusted  gross  income  is 
estimated  to  be  the  difference  between  total  income  and 
transfer  income. 

(d)  Adjustment of the 1969 PCI Estimates to Larger Area
Controls -- One of the most significant adjustments to
the subcounty estimates is controlling the estimates to the
estimates of the higher level geographic areas. It ensures that
the sum of the estimates for all pieces of geography in the
area is the same as an independent estimate for that area.
However, this does not put sufficient control on the
distribution of values within the control area, especially
when allocated values are used. A solution to this limitation
was found in a two-way adjustment procedure (Multiple
Univariate Rake) which controls not only higher level
geography totals, but also several size class totals for the
entire State. This determination was made when tests showed
that the average income level of an area in a particular size
class (e.g., less than 500), especially for areas with small
population, tended to reflect the level for all areas in that
class in the State more closely than it reflected the county level.

The  first  step  in  the  control  procedure  is  to  establish  the 
control  totals— 1970  census  aggregate  money  income  for  each 
county  and  for  all  places  in  selected  size  classes  in  the  State. 
The  1969  aggregate  income  figures  as  developed  in  the 
previous  section  for  the  individual  subcounty  areas  are  then 
repeatedly  adjusted  to  each  of  the  control  totals  until  the 
sum  of  the  areas  in  the  county  is  equal  to  the  county  total, 
and the sum of areas in a size class is within 1 percent of the
size  class  total. 

After this adjustment is completed, the final total income
figure is divided into adjusted gross income and transfer income
and is ready for use in the update procedure.
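
The two-way control procedure described in (d) resembles iterative proportional fitting. The sketch below is an illustration of that general technique, not the Bureau's program: the function names and the list-of-dictionaries layout are hypothetical, and it assumes positive incomes and control totals that share the same grand total.

```python
def group_sums(areas, key):
    """Sum area incomes by county or by size class."""
    sums = {}
    for area in areas:
        sums[area[key]] = sums.get(area[key], 0.0) + area["income"]
    return sums


def scale_to(areas, key, totals):
    """Scale incomes in place so each group sums to its control total."""
    sums = group_sums(areas, key)
    for area in areas:
        area["income"] *= totals[area[key]] / sums[area[key]]


def rake(areas, county_totals, class_totals, tol=0.01, max_iter=100):
    """Alternate size class and county adjustments until both controls hold."""
    areas = [dict(a) for a in areas]  # leave the caller's data untouched
    for _ in range(max_iter):
        scale_to(areas, "size_class", class_totals)
        scale_to(areas, "county", county_totals)  # county sums now match
        class_sums = group_sums(areas, "size_class")
        if all(abs(class_sums[c] - t) <= tol * t
               for c, t in class_totals.items()):
            break
    return areas
```

Ending each pass with the county adjustment means the county sums match their totals while the size class sums are only required to fall within the tolerance, which mirrors the stopping rule in the text.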

Estimating  and   Applying  the  Rate  of  Change  in  Per 
Capita  Income 

Estimating  and  applying  the  rate  of  change  in  per  capita 
income  can  also  be  viewed   in   terms  of  four  procedures: 

(a)  Adjustment  of  IRS  Data  for  Annexation  and 
Boundary  Changes— The  annexation  and  boundary  change 
adjustment  for  IRS  data  is  done  with  the  same  formula  as  the 
census  adjustment.  The  only  difference  between  the  two  is 
that census data have a 1970 geographic base and the IRS
data  have  a  1972  geographic  base.  All  IRS  data  items  for 
1969, 1972, and 1974 are adjusted in this operation.

(b)  Data  Replacements  and  Constraints— Due  to  the 
potential  limitations  in  the  IRS  data  used  at  the  subcounty 
level,  a  series  of  edits  were  used  to  insure  representative  data, 
and  a  series  of  data  replacements  and  constraints  were 
developed  to  control  the  estimation  process. 

In cases where (1) no IRS data were available, or (2) the
data  failed  an  edit,  the  rate  of  change  for  the  county  area  was 
used.  For  example,  if  the  number  of  IRS  tax  returns  coded 
to  an  area  was  less  than  25,  the  county  rates  of  change  were 
used. 
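
The replacement rule can be expressed as a small selection function. This is a sketch only: the names are hypothetical, and the 25-return threshold is the one edit the text cites as an example, standing in here for the full series of edits.

```python
MIN_RETURNS = 25  # example edit threshold cited in the text


def select_rate_of_change(place_rate, county_rate, n_returns,
                          passed_other_edits=True):
    """Return the subcounty rate of change, or the county rate as a fallback."""
    no_data = place_rate is None
    too_few_returns = n_returns < MIN_RETURNS
    if no_data or too_few_returns or not passed_other_edits:
        return county_rate
    return place_rate
```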

(c)  Estimating and Applying the Income Rate of
Change -- As stated above, the basic estimation procedure is
the  application  of  the  rate  of  change  in  IRS  adjusted  gross 
income  per  exemption  and  BEA  county  transfer  income  per 


capita  to  the  1969  per  capita  income  base.  Regardless  of 
whether  the  base  is  an  adjusted  figure  or  the  rate  of  change  in 
AGI is a fallback value, the update procedure is the same.
The  1972  estimates  are  developed  first,  then  the  1974 
estimate  is  built  on  the  1972  estimate,  applying  a  1972  to 
1974  rate  of  change.  Formula  6  shows  the  equation  for 
1972. 


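Formula 6. 1972 Total Money Income Estimates

\[
\text{1972 TMY} =
\left[\frac{\dfrac{\text{1972 IRS AGI (P)}}{\text{1972 IRS EXEMP (P)}}}
           {\dfrac{\text{1969 IRS AGI (P)}}{\text{1969 IRS EXEMP (P)}}}\right]
\times
\left[\frac{\text{1969 CEN AGI EST (P)}}{\text{1970 CEN POP (P)}}\right]
\times \text{1973 CEN POP (P)}
\;+\;
\left[\frac{\dfrac{\text{1972 BEA TR (C)}}{\text{1972 BEA POP (C)}}}
           {\dfrac{\text{1969 BEA TR (C)}}{\text{1969 BEA POP (C)}}}\right]
\times
\left[\frac{\text{1969 CEN TR EST (P)}}{\text{1970 CEN POP (P)}}\right]
\times \text{1973 CEN POP (P)}
\]

Where:

TMY  =  Total money income
AGI  =  Adjusted gross income
TR   =  Transfer payment income
C    =  Data for county area
P    =  Data for subcounty area (place)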
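
The 1972 update for a single subcounty area can be sketched as follows, assuming the inputs have already passed the edits and replacements described in (b). The function and argument names are hypothetical; the suffixes `_p` and `_c` mark place and county data, following the (P) and (C) labels in the formula.

```python
def tmy_1972(irs_agi_1969_p, irs_exemp_1969_p,
             irs_agi_1972_p, irs_exemp_1972_p,
             bea_tr_1969_c, bea_pop_1969_c,
             bea_tr_1972_c, bea_pop_1972_c,
             cen_agi_1969_p, cen_tr_1969_p,
             cen_pop_1970_p, cen_pop_1973_p):
    """Estimate 1972 total money income for one subcounty area."""
    # Rate of change in IRS adjusted gross income per exemption (place data).
    agi_ratio = ((irs_agi_1972_p / irs_exemp_1972_p)
                 / (irs_agi_1969_p / irs_exemp_1969_p))
    # Rate of change in BEA transfer income per capita (county data).
    tr_ratio = ((bea_tr_1972_c / bea_pop_1972_c)
                / (bea_tr_1969_c / bea_pop_1969_c))
    # Apply each rate to the 1969 per capita base, then multiply by the
    # 1973 population estimate to obtain aggregates.
    agi_1972 = agi_ratio * cen_agi_1969_p / cen_pop_1970_p * cen_pop_1973_p
    tr_1972 = tr_ratio * cen_tr_1969_p / cen_pop_1970_p * cen_pop_1973_p
    return agi_1972 + tr_1972
```

In the full procedure the two components are raked separately before they are combined, so a production version would return them separately rather than as a sum.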

(d)  Raking the Updated Components to County
Controls -- After the adjusted gross income and transfer
income  components  for  1972  and  1974  are  estimated,  they 
are  raked  in  a  similar  manner  to  the  1969  base.  The  only 
differences  are  that  adjusted  gross  income  and  transfer 
income  are  raked  separately,  and  that  the  State  size  class 
control  totals  are  developed  by  summing  the  individual 
estimates  in  that  class  and  raking  these  size  class  totals  to  the 
State  estimate. 

After  the  adjustment  procedure  is  completed,  adjusted 
gross  income  and  transfer  income  are  combined  and  the  sum 
is  divided  by  the  appropriate  population  estimate  for  the 
final  per  capita  income  estimate. 
ATTACHMENT A


THE DEVELOPMENT OF A 1969 PER CAPITA
INCOME BASE FOR PLACES WITH 1970 CENSUS
DATA SUBJECT TO LARGE DEGREES OF
SAMPLING VARIABILITY

Background 

As noted in the text, the sample data for small places from
the 1970 census are subject to large degrees of sampling
variability, and have been found to lack sufficient statistical
reliability for use in the preparation of the 1972 and 1974 per
capita income (PCI) estimates for the Office of Revenue
Sharing. For an interim solution to this problem, the Office of
Revenue Sharing decided to use the 1969 county per capita
income figure for all subcounty units with a 1970 census
20-percent sample population less than 500. The Census Bureau
concurred with this decision, with the understanding that the
problem would be resolved when the next set of estimates was
produced. Because of the time constraints involved with the
task of developing an estimation model, matching data files, and
processing the 38,000 1972 PCI estimates, there were no
available resources for research on the sampling variability
problem. The purpose of this paper is to present an alternative
1969 per capita income base for small places and for use in the
development of these updates.

Methodology 

Although  there  is  theoretical  justification  for  using  the 
county  PCI  amount  to  stabilize  the  small  subcounty  PCI 
estimates,  tests  have  shown  that  the  use  of  the  county  plug  is  of 
limited  value  and  produces  biased  results.  In  addition,  the 
county  plug  does  not  make  use  of  additional  economic  data 
available for these small areas, such as housing value from the
1970 census and income information from tax returns. In
theory  a  regression  estimate  using  place  PCI  estimates  as  the 
dependent  variable  and  the  county  PCI  value  as  well  as  the 
additional  data  as  independent  variables  would  be  preferable. 
Although  the  regression  estimates  do  make  use  of  additional 
data  for  the  individual  areas,  they  do  not  constitute  the  best 
estimate  we  could  develop  because  they  do  not  make  full  use  of 
the  information  contained  in  the  sample  PCI  estimates.  A  set  of 
estimates  with  lower  overall  statistical  error  can  be  developed  by 
forming  a  weighted  average  of  the  two  estimates. 

The  regression  estimates  were  prepared  using  the  sample  PCI 
for  the  subject  areas  as  the  dependent  variable.  The  independent 
variables  were  the  1970  census  value  of  owner  occupied  housing 
units  (a  100-percent  item)  and  IRS  adjusted  gross  income  per 
exemption  for  the  subject  area,  and  the  same  variables  plus 
1970  census  per  capita  income  for  the  county  area.  There  is  a 
substantial degree of quality control exercised in the
development and the use of the regression estimates. The census and
IRS data for the regression estimates are tested for reliability
before they are used. If one or more of the independent variable
data items are suspect, based on constraints put on the levels of
the variables, those variables are dropped from the regression
equation. Each estimate is then flagged to indicate which
variables were used in the equation.

The  weights  applied  to  the  two  per  capita  figures  are 
measured  by  the  "statistical  error"  present  in  those  figures.  The 
"error"  in  the  sample  estimate  is  the  sampling  variability  or 
variance  in  the  estimate.  The  "error"  in  the  regression  estimate 
is  the  variance  plus  the  bias  squared.  The  variance  in  the 
regression  estimate  is  never  greater  than  that  in  the  sample 
estimate,  and  for  small  places  will  be  substantially  less.  As  a 
result,  the  statistical  error  for  the  regression  estimates  has  a  high 
probability  of  being  smaller  than  that  for  the  sample  estimates 
for the subject areas. The weight for each estimate varies
inversely with its relative statistical error. For example, if the
statistical error in the sample estimate is twice as large as that
for the regression estimate, the weight applied to the sample
estimate is .33, and that applied to the regression estimate is
.67. If the errors in the two estimates are approximately equal,
an equal weight is applied to each estimate.
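
The weighting rule can be written out as a short sketch. The function names are hypothetical; the only assumption beyond the text is that the two weights are normalized to sum to one, which reproduces the one-third and two-thirds split in the example above.

```python
def inverse_error_weights(sample_error, regression_error):
    """Return (sample weight, regression weight), each inverse to its own error."""
    total = sample_error + regression_error
    # The estimate with the larger error receives the smaller weight.
    return regression_error / total, sample_error / total


def weighted_pci(sample_pci, regression_pci, sample_error, regression_error):
    """Weighted average of the sample and regression per capita income figures."""
    w_sample, w_regression = inverse_error_weights(sample_error, regression_error)
    return w_sample * sample_pci + w_regression * regression_pci
```

Here the "error" for the sample estimate would be its sampling variance, and for the regression estimate its variance plus squared bias, as described in the text.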

The  weighted  estimate  is  theoretically  better  than  either  the 
sample  or  the  regression  estimate  because  on  the  average  the 
error  present  will  be  less.  However,  this  does  not  guarantee  that 
the  weighted  PCI  will  have  less  error  for  any  particular  estimate. 
As  a  result,  constraints  have  been  put  on  the  weighted  estimates 
to  control  the  level  of  the  individual  estimates.  No  weighted 
estimate  is  allowed  to  deviate  from  the  sample  estimate  by  more 
than  one  standard  error. 
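
One way to apply this constraint is to clip the weighted estimate to a band of one standard error on either side of the sample estimate. The text states the limit but not how it is enforced, so the clipping shown here is an assumption, and the function name is hypothetical.

```python
def constrain_to_sample(weighted_pci, sample_pci, sample_std_error):
    """Keep the weighted estimate within one standard error of the sample figure."""
    lower = sample_pci - sample_std_error
    upper = sample_pci + sample_std_error
    return min(max(weighted_pci, lower), upper)
```

For example, with a sample PCI of 3,000 and a standard error of 300, a weighted estimate of 2,600 would be moved up to 2,700, while one of 2,900 would be left unchanged.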

Empirical  Testing  of  the  1969  Weighted  Per  Capita 
Income  Estimates 

Two tests were designed to determine the accuracy of the
1969 weighted per capita income estimates relative to the 1970
census sample PCI figure and the county PCI plug estimates
used for the original round of estimates. The first test comes
close to the "ideal" type of test: 1972 estimates
based on the three PCI amounts are compared to an
independent per capita income estimate calculated from a special
census in which income was collected on a 100-percent basis.
The test is limited by the small number of areas where such
censuses were taken, only 24 in the size classes we are dealing
with here. The second test groups subject areas into groups of
10 to reduce the sampling variability of the 1970 census per
capita income estimates and is used to evaluate, for the group,
the weighted estimate method and the county plug method.
This test has the advantage of utilizing all areas of interest,
constituting a very substantial comparison of the weighted
estimates and the county plug. It cannot be used directly to
compare the 1970 census PCI for a particular area with the two
other estimates.

The approach to the "special census test" is very simple. The
1970 census sample per capita income figure, the 1969 weighted
PCI estimates, and the 1970 census county PCI figure are
updated to 1972. The absolute percent difference of each figure
from the special census figure is calculated, and the average
difference for all places in two size classes (1970 census sample
population less than 500 and between 500 and 999) is
compared. The results are shown in table 1.

Compared to the special census PCI values in the less than
500 population size class (17 areas), the average absolute
percent difference for the updated weighted estimates was 22.0
percent. This is 6.6 percentage points less than that for the
updated 1970 census sample figures (28.6) and almost 10
percentage points less than that for the updated county plug
figures (31.6). In this size class the updated weighted estimate
was closer to the special census estimate than the updated
census figure in 10 of the 17 cases, and was better than the
county plug estimates in 13 of the 17 cases. The results for the
500 to 999 size class (7 areas) are similar. The average absolute
difference is smaller for all 3 PCI figures, but at 15.6 percent,
the updated weighted estimate is about 3.5 percentage points less
than the difference in the sample and county figures (19.1 and 19.3,
respectively). The results of this test do favor the use of the
1969 weighted PCI estimate over the two other figures, but the
reliability of the results must be considered suspect because of
the very small number of areas included in the test. There are
almost 10,000 places and MCD's in the two size classes.
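
The comparison statistic is easy to reproduce. The sketch below computes the average absolute percent difference for the 500 to 999 size class, using the 1972 figures reported in table 1; the function and variable names are hypothetical.

```python
def mean_abs_pct_diff(estimates, benchmarks):
    """Average absolute percent difference of estimates from benchmark values."""
    diffs = [abs(est - bench) / bench * 100.0
             for est, bench in zip(estimates, benchmarks)]
    return sum(diffs) / len(diffs)


# 1972 figures for the seven areas in the 500 to 999 size class (table 1).
special_census = [1946, 2224, 3329, 2241, 3521, 2062, 2968]
weighted_base = [2490, 2315, 3418, 2619, 4095, 2765, 2754]
county_base = [2646, 2018, 3072, 2546, 4430, 2740, 2675]

weighted_result = mean_abs_pct_diff(weighted_base, special_census)
county_result = mean_abs_pct_diff(county_base, special_census)
```

Rounded to one decimal place, the two results agree with the 15.6 and 19.3 percent averages cited above.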

Table 1. Comparison of Selected Per Capita Income Estimates to Special Census Values for 1972

(1970 census weighted sample)

                                               1972   1970 census base     1969 estimate base    County or MCD base
                                            special       1972    Percent       1972    Percent       1972    Percent
Special census areas                         census   estimate difference   estimate difference   estimate difference

All areas with less than 500 population         (X)        (X)       28.6        (X)       22.0        (X)       31.6
Newington, Ga.                                2,019      2,225       10.2      2,302       14.0      2,279       12.9
Foosland village, Ill.                        2,899      2,771        4.4      3,199       10.3      3,796       30.9
Bonaparte, Iowa                               2,331      3,126       34.1      2,942       26.2      2,542        9.1
McNary, La.                                   2,333      2,303        1.3      2,527        8.3      2,908       24.6
Freeborn village, Minn.                       2,741      3,693       34.7      3,338       21.8      2,922        6.6
Spruce Valley township, Minn.                 2,430      1,894       22.1      1,949       19.8      2,076       14.6
Jacksonville, Mo.                             2,723      2,338       14.1      2,611        4.1      3,233       18.7
Thayer, Nebr.                                 2,742      2,245       18.1      2,870        4.7      3,452       25.9
Benton town, N.H.                             1,788      2,874       60.7      3,284       78.7      3,570       99.7
Nora township, N. Dak.                        1,780      2,629       47.7      2,754       54.7      3,476       95.3
Riga township, N. Dak.                        1,454      2,749       89.1      2,411       65.8      2,711       86.5
Deer Creek, Okla.                             2,451      2,493        1.7      2,673        9.1      2,762       12.7
Dudley Borough, Pa.                           2,446      2,168       11.4      2,411        1.4      2,608        6.6
Brookings township, S. Dak.                   3,132      3,400        8.6      3,309        5.7      2,395       23.5
Valley township, S. Dak.                      1,574      1,946       23.6      1,972       25.3      2,114       34.3
Bryant township, S. Dak.                      2,412      1,120       53.6      2,158       10.5      2,695       11.7
Parrish town, Wis.                            3,567      5,399       51.4      4,079       14.4      2,721       23.7

All areas with 500 to 999 population            (X)        (X)       19.1        (X)       15.6        (X)       19.3
Caswell plantation, Maine                     1,946      2,656       36.5      2,490       28.0      2,646       36.0
Sugar Creek township, Mo.                     2,224      2,035        8.5      2,315        4.1      2,018        9.3
Jeromesville, Ohio                            3,329      3,081        7.4      3,418        2.7      3,072        7.7
Rush township, Ohio                           2,241      2,545       13.6      2,619       16.9      2,546       13.6
Dennison township, Pa.                        3,521      4,411       25.3      4,095       16.3      4,430       25.8
Manor, Tex.                                   2,062      2,746       33.2      2,765       34.1      2,740       32.9
Derby Center, Vt.                             2,968      2,694        9.2      2,754        7.2      2,675        9.9

(X) Not applicable.

The "groups of ten test" was designed to expand the
evaluation process to include all areas in the subject size classes.
In the absence of "special census" estimates to be used as the
comparison base we created a "better" 1970 census sample
figure by grouping the subject areas into blocks of ten areas
each. The sampling variability on the PCI figure for a group of
10 areas is substantially less than that on the PCI figures for
each of the 10 areas. The grouping of the areas is a controlled
process to avoid diluting the effectiveness of the test. The areas
are sorted by level of IRS adjusted gross income per exemption
and then grouped from that sort. The areas are sorted by
income level because a random grouping would result in a large
number of groups with an average 1970 census PCI level which
could be reflected by grouped weighted estimates or county
estimates where large errors were cancelled out by the grouping
itself. When areas with similar income levels are grouped
together, based on data independent of the sample estimates,
the potential for homogenizing the groups is reduced.

The  areas  are  sorted  by  level  of  IRS  adjusted  gross  income 
per  exemption  because  these  data  are  independent  of  the  1970 
census  sample  income  data,  which,  if  used  as  the  basis  for  the 
sort  and  then  as  the  object  of  the  comparison,  could  themselves 
subject  the  test  to  bias  by  overstating  the  error  in  the  weighted 
estimates  or  the  county  figures.  Using  an  independent  source  in 
the  sort  process  allows  variation  in  the  sample  data  to  be 
reflected,  and  at  the  same  time  permits  group  membership  to  be 
determined  independent  of  the  estimates  derived  from  this 
particular  sample. 
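
The grouping step can be sketched as follows. The field names are hypothetical, and the group figure is computed here as a population-weighted average of the member PCI values, which is an assumption; the text does not state how the group PCI was formed.

```python
def groups_of_ten(areas, group_size=10):
    """Sort areas by IRS AGI per exemption, then cut the sorted list into groups."""
    ordered = sorted(areas, key=lambda area: area["agi_per_exemption"])
    return [ordered[i:i + group_size]
            for i in range(0, len(ordered), group_size)]


def group_pci(group, field):
    """Population-weighted average of a per capita figure over one group."""
    total_income = sum(area[field] * area["population"] for area in group)
    total_population = sum(area["population"] for area in group)
    return total_income / total_population
```

Calling `group_pci` on the same group with the sample, weighted, and county figures in turn would give the three grouped values that the test compares. The sort key comes from IRS data only, so group membership does not depend on the census sample figures being evaluated.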

The sorting was done in several stages. In the preparation of
the regression estimates, IRS data were not used in the
regression equations if the 1969 exemptions per capita ratio was
outside the 0.8 to 1.1 range. This condition occurred in about 60
percent of the cases. Because of this, it was decided to evaluate
separately the estimates where the IRS data were not used to
see if this had a large effect on the accuracy of the weighted
estimates. In addition, we sorted MCD's and places separately.
The result is four different sorts: (1) MCD's where IRS data
were used in weighted estimates, (2) places where IRS data were
used in weighted estimates, (3) MCD's where IRS data were not
used, and (4) places where IRS data were not used.

Tables 2 and 3 show the results of the "groups of ten test."
The  data  show  that,  with  or  without  IRS  data,  for  MCD's  or  for 
places,  the  1969  weighted  per  capita  income  estimates  for  the 
groups  of  ten  reflected  the  1970  census  sample  PCI  figures  far 
more  accurately  than  did  the  estimates  using  the  county  PCI. 
For  MCD's  with  weighted  estimates  using  IRS  data,  the  "groups 
of  ten"  weighted  estimates  were  closer  to  the  1970  census 
sample  figure  90  percent  of  the  time;  for  place  groupings  they 
were  closer  in  73  percent  of  the  cases.  The  "groups  of  ten" 
weighted  estimates  not  using  IRS  data  in  the  formula  were 
closer  for  86  percent  of  the  MCD  groupings  and  for  80  percent 
of  the  place  groupings. 

The  results  of  the  "groups  of  ten  test"  strongly  favor  the  use 
of  the  1969  weighted  estimates  over  the  1969  PCI  figures  for 
counties.  But  they  are  even  more  significant  than  this.  There  is 
strong  evidence  of  a  large  degree  of  accuracy  in  the  weighted 
estimates  themselves,  regardless  of  their  comparison  to  the 
county  figures.  For  each  of  the  four  groups,  70  percent  or  more 
of  the  grouped  weighted  estimates  are  within  one  standard  error 
of  the  grouped  PCI  estimate.  About  90  percent  or  more  are 
within  two  standard  errors.  In  addition,  the  results  of  this  test 
show  there  is  almost  no  bias  in  the  weighted  estimates  when 
they  are  grouped  together.  Thus,  based  on  statistical  theory  and 
the  results  of  these  tests,  there  is  a  strong  case  for  the 
conclusion  that  the  1969  weighted  estimates  for  the  individual 
areas  would  be  more  reliable  than  both  the  county  and  the 
sample  PCI  figures. 


Table 2. Relation of Per Capita Income Estimates and County Plugs for 1969 to 1970 Census Groups of Ten

(Places with 1969 IRS exemptions to 1970 census population ratio between the 0.8 to 1.1 range)

                                        1969 MCD's and places              1969 MCD's                  1969 places
                                     Estimate      County plug      Estimate      County plug      Estimate      County plug
Relation to 1969 sample PCI       Number Percent  Number Percent  Number Percent  Number Percent  Number Percent  Number Percent

Total groups                         546   100.0     546   100.0     334   100.0     334   100.0     212   100.0     212   100.0

Within 10 percent of sample PCI      482    88.3     249    45.6     320    95.8     138    41.3     172    81.1     111    52.4
Outside 10 percent of sample PCI      64    11.7     297    54.4      14     4.2     196    58.7      40    18.9     101    47.6

Within 1 standard error              411    75.3     146    26.7     287    85.9      77    23.1     149    70.3      61    28.8
Between 1 and 2 standard errors       92    16.8     144    26.4      44    13.2      74    22.2      28    13.2      60    28.3
Outside 2 standard errors             43     7.9     256    46.9       3     0.9     183    54.8      35    16.5      91    42.9

Closer to sample PCI                 457    83.7      89    16.3     300    89.8      34    10.2     154    72.6      58    27.4
Table 3. Relation of Per Capita Income Estimates and County Plugs for 1969 to 1970 Census for Groups of Ten

(Places with 1969 IRS exemptions to 1970 census population ratio outside the 0.8 to 1.1 range)

                                        1969 MCD's and places              1969 MCD's                  1969 places
                                     Estimate      County plug      Estimate      County plug      Estimate      County plug
Relation to 1969 sample PCI       Number Percent  Number Percent  Number Percent  Number Percent  Number Percent  Number Percent

Total groups                         903   100.0     903   100.0     473   100.0     473   100.0     430   100.0     430   100.0

Within 10 percent of sample PCI      780    86.4     424    47.0     427    90.3     199    42.1     360    83.7     213    49.5
Outside 10 percent of sample PCI     123    13.6     479    53.0      46     9.7     274    57.9      70    16.3     217    50.5

Within 1 standard error              676    74.9     282    31.2     375    79.3     138    29.2     306    71.2     143    33.3
Between 1 and 2 standard errors      155    17.2     249    27.6      84    17.8     115    24.3      82    19.1     127    29.5
Outside 2 standard errors             72     8.0     372    41.2      14     3.0     220    46.5      42     9.8     160    37.2

Closer to sample PCI                 740    81.9     163    18.1     406    85.8      67    14.2     342    79.5      88    20.5
On the Comparability of Subnational
Data from Different Census Bureau
Household Surveys

Earle  J.  Gerson 
Bureau  of  the  Census 

The  Census  Bureau  conducts  numerous  household  surveys 
that  are  sources  of  data  at  the  national  level  and  for  smaller 
geographic  units.  For  the  most  part,  these  surveys  are  conducted 
at  the  request  of  other  Federal  agencies  and  the  Census  Bureau 
plays  a  major  role  in  data  dissemination  for  most  of  these 
studies. 

Given the need for data for States, metropolitan areas, and
cities, the question arises as to what can be done to fully use
the resources now available. Specifically, this paper presents
descriptions of some sources of nonsampling errors in surveys
and some observations on the extent to which data from
different surveys can be used to estimate changes over time or be
combined to reduce sampling variability.

This paper will present brief descriptions of the major
household surveys that provide subnational data, comparisons of
estimates from different surveys, and discussions of possible
reasons for the differences observed in these comparisons. The
principal comparisons will be of unemployment, poverty, and
crime victimization rates, which have important uses at the
State and local level.

DESCRIPTION  OF  THE  SURVEYS 

The  Current  Population  Survey  (CPS) 

Since 1942 this survey has provided monthly data on the
labor force status of the population. It also provides
information on a broad range of population characteristics on a
continuing basis and serves as a collection vehicle for other topics
on an ad hoc basis.

The survey sample currently includes about 65,000 units
monthly, of which about 55,000 comprise the national sample
and the remaining 10,000 are supplementary to improve the
reliability of labor force data for the less populous States. There
are 610 primary sampling units (PSU's), of which 461 were
designated for the national sample and 149 for the State
supplement.

Since the principal purpose of CPS has been to provide
monthly labor force data, without excessively burdening sample
households, the survey design includes a rotating sample in
which a household is interviewed for 4 consecutive months,
is dropped from the sample for 8 months, then returns to
sample for 4 more monthly interviews. This feature has an
important effect on the comparability of survey estimates and
will be discussed in further detail.
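
The rotation pattern can be made concrete with a small sketch. The function names are hypothetical, and months are counted from zero at a household's first interview.

```python
def interview_months(entry_month):
    """Months in which a household is interviewed: 4 in, 8 out, 4 in."""
    first_spell = range(entry_month, entry_month + 4)
    second_spell = range(entry_month + 12, entry_month + 16)
    return list(first_spell) + list(second_spell)


def is_interviewed(entry_month, month):
    """True if a household entering in entry_month is interviewed in month."""
    return month in interview_months(entry_month)
```

Under this pattern a household interviewed in a given month is interviewed again in the same calendar month one year later, since its second spell begins 12 months after the first.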

About 1,300 interviewing staff were used in the CPS.
Interviewers are intensively trained upon recruitment, when they
receive at least 8 days of training. In addition, the quality of
their work is closely monitored through editing, field
observation, and reinterview during their average job tenure of 4 years.

Publications are issued covering the monthly and annual
average labor force data and the topics included as supplements
to the survey. In addition, a public use microdata file is routinely
prepared for the March survey which includes data on labor
force status, work experience during the previous year, income,
and migration. For the years 1973 to 1976, this file provides
separate identification of 13 States and 35 standard
metropolitan statistical areas (SMSA's). Starting with 1977, all States
and 44 SMSA's will be identified.

Except  for  the  annual  average  labor  force  data,  which  are 
used to distribute funds under the Comprehensive Employment
and Training Act of 1973, very little information is
sufficiently  reliable  to  warrant  regular  analysis  and  publication 
by  the  Census  Bureau  for  individual  States,  metropolitan  areas, 
and  cities.  However,  special  tabulations  can  be  prepared  on 
request  at  cost. 

The  Survey  of  Income  and  Education  (SIE) 

Unlike  the  CPS,  this  one-time  survey  was  designed  specifically 
to provide data for individual States. The objective was to provide
data required by the 1974 amendments to the Elementary
and  Secondary  Education  Act  relating  to  the  number  of  children 
in  families  with  income  below  the  poverty  level  and  the  number 
of  bilingual  persons.  The  survey  also  included  questions  on 
other  topics  of  special  interest  to  the  Department  of  Health, 
Education,  and  Welfare  (HEW). 

It  was  conducted  in  the  spring  of  1976  using  a  sample  of 
191,000  units.  The  largest  State  sample  was  5,900  and  the 
smallest  was  1,900.  Nationally,  there  were  950  sample  PSU's. 

The  interviewer  staff  numbered  about  2,500.  Of  these,  about 
1,600  were  newly  recruited  and  the  remaining  900  were  from 
the  permanent  field  staff  that  works  on  CPS  and  other  current 
surveys.  About  40  percent  to  50  percent  of  the  interviews  were 
conducted  by  the  temporary  staff. 

The  core  questionnaire  used  in  the  SIE  was  identical  to  the 
March  CPS.  In  addition,  it  included  questions  on  bilingualism 
and  the  other  topics  of  special  interest  for  State-level  data  such 
as  education,  disability,  health  insurance  coverage,  food  stamp 
recipiency,  housing  costs,  and  estimated  cash  assets. 

The required poverty data have been provided to the
Congress. Detailed reports are to be issued by the Bureau of the
Census and the Departments of HEW and Labor. The Census
Bureau will also make available a public use tape this month
which will include identification of all States and 122
metropolitan areas.

The National Crime Survey (NCS)

The purpose of the NCS, which began in 1972, is to obtain
estimates of the amount and nature of crime victimization in
the Nation by interviewing a sample of the population. In
addition to the victimization data, this survey also collects
information relating to migration and labor force status.

The sample size for this continuing national survey is 84,000
units, one-sixth of which are interviewed each month in a
6-month cycle. One-seventh of the units rotate out of sample and
are replaced during each cycle. The incoming rotation group is
used only for bounding, which is described below. The sample
is distributed over 376 PSU's, which are a subset of those used
in the CPS.

During the first few years of the survey, in addition to the
national survey, a substantial part of the total effort was dedicated
to producing victimization data for individual large cities of
particular interest to the sponsoring organization, the Law
Enforcement Assistance Administration (LEAA). Altogether, 26
different cities were surveyed; 13 of these were surveyed twice, using
a sample of 12,000 units in each city. The original plan was for
each city to be surveyed at 3-year intervals.


NATIONAL CRIME SURVEY CITIES

Survey year      Cities

1972 and 1975    Atlanta, Baltimore, Cleveland, Dallas, Denver, Newark,
                 Portland (Oregon), and St. Louis.

1973 and 1975    Chicago, Detroit, Los Angeles, New York, and Philadelphia.

1974             Boston, Buffalo, Cincinnati, Houston, Miami, Milwaukee,
                 Minneapolis, New Orleans, Oakland, Pittsburgh, San Diego,
                 San Francisco, and Washington, D.C.

The permanent field staff that collects data for other current
surveys also does the interviewing for the National Crime
Survey. However, the city surveys were conducted in a 2- to
3-month period using a newly recruited temporary staff for each
round of data collection.

Survey results are analyzed by the Census Bureau and the
reports are issued by LEAA. Microdata tapes are also available for
both the national and city samples, but there is no
identification of individual States, SMSA's, or cities on the national
tapes. Rather, general descriptions such as neighborhood
characteristics, place size, and place descriptions are provided for
analytical purposes.

The Annual Housing Survey (AHS)

This survey provides a continuing measure of the size and
characteristics of the Nation's housing inventory. Conducted
since 1973 under the sponsorship of the Department of
Housing and Urban Development, the survey provides a limited
amount of information comparable with other household
surveys, notably household counts and characteristics.

The survey is similar to the crime survey in that it has two
components, a national sample and a set of rotating SMSA
samples. The national sample includes about 80,000 addresses
in 461 PSU's which are interviewed late each calendar year and
do not rotate out of sample. Thus, the survey has longitudinal
features.

The SMSA coverage has been on a 3-year cycle. The first
set of 20 SMSA's was interviewed during the 12-month period
from April 1974 to March 1975; the second set of 20, from
1975 to 1976; and the third set from 1976 to 1977. The sample
size in each of 4 SMSA's per group is 15,000 equally divided
between central city and the balance of the SMSA. For the
remaining 16 areas, the sample size is 5,000 distributed
proportionally to the populations of the central city and balance
of area. Starting with the 1978 and 1979 interviewing period,
the SMSA's will be divided into four sets to be interviewed
over a 4-year cycle.

The national data are collected by the permanent staff of
interviewers, and the SMSA data by a temporary staff recruited
to work the 12 months of interviewing.


ANNUAL HOUSING SURVEY

SMSA's interviewed: April 1974 to March 1975 and
April 1977 to March 1978

Sample size: 15,000 housing units

Boston, Mass.
Washington, D.C.-Md.-Va.
Detroit, Mich.
Los Angeles-Long Beach, Calif.

Sample size: 5,000 housing units

Albany-Schenectady-Troy, N.Y.
Newark, N.J.
Pittsburgh, Pa.
Memphis, Tenn.-Ark.-Miss.
Orlando, Fla.
Minneapolis-St. Paul, Minn.-Wis.
Saginaw, Mich.
Dallas-Fort Worth, Tex.
Wichita, Kans.
Salt Lake City-Ogden, Utah
Anaheim-Santa Ana-Garden Grove, Calif.
Phoenix, Ariz.
Spokane, Wash.
Tacoma, Wash.
Madison, Wis.*

*Not interviewed April 1974 to March 1975.

SMSA's interviewed: April 1975 to March 1976

Sample size: 15,000 housing units

Philadelphia, Pa.-N.J.
Atlanta, Ga.
Chicago, Ill.
San Francisco-Oakland, Calif.

Sample size: 5,000 housing units

Hartford, Conn.
Springfield-Chicopee-Holyoke, Mass.-Conn.
Paterson-Clifton-Passaic, N.J.
Rochester, N.Y.
Newport News-Hampton, Va.
Miami, Fla.
Cincinnati, Ohio-Ky.-Ind.
Columbus, Ohio
Milwaukee, Wis.
New Orleans, La.
San Antonio, Tex.
Kansas City, Mo.-Kans.
Colorado Springs, Colo.
Riverside-San Bernardino-Ontario, Calif.
San Diego, Calif.
Portland, Oreg.-Wash.
Madison, Wis.

SMSA's interviewed: May 1976 to April 1977

Sample size: 15,000 housing units

New York, N.Y.-N.J.
Houston, Tex.
St. Louis, Mo.-Ill.
Seattle-Everett, Wash.

Sample size: 5,000 housing units

Providence-Warwick-Pawtucket, R.I.-Mass.
Buffalo, N.Y.
Allentown-Bethlehem-Easton, Pa.-N.J.
Baltimore, Md.
Birmingham, Ala.
Louisville, Ky.-Ind.
Raleigh-Durham, N.C.
Cleveland, Ohio
Indianapolis, Ind.
Grand Rapids, Mich.
Oklahoma City, Okla.
Omaha, Nebr.-Iowa
Denver-Boulder, Colo.
Honolulu, Hawaii
Las Vegas, Nev.
Sacramento, Calif.


COMPARABILITY  OF  SURVEY  RESULTS 

Several sources of subnational data have been described.
Since each is designed for a specific purpose, the content
overlap is limited. However, the overlap is sufficient to illustrate
the differences in results that can occur as a result of
different methodology, question framing, and other aspects of survey
operations.

Let's turn first to some examples of differences encountered
in comparison of survey results, the first example being the
differences between the CPS and SIE.

The  second  example  will  be  the  comparison  between  the 
labor  force  status  data  from  CPS  and  the  Crime  Survey. 

The  third  example  is  the  comparison  of  crime  victimization 
data  for  several  large  cities  as  derived  from  the  National  Crime 
Survey  and  the  Cities  Crime  Survey. 

SOURCES  OF  DIFFERENCES 


These  comparisons  illustrate  the  differences  that  result  from 
independent  efforts  to  measure  the  same  phenomenon.  Since 
most  of  the  differences  presented  are  statistically  significant,  we 
shall  now  examine  briefly  some  aspects  of  the  surveys  that 
may  account  for  the  differences  and  consider  the  implications 
for  joint  use  of  data  from  different  sources. 

Concepts  and  Question  Framing 

It  is  obvious  that  to  have  the  same  expected  value,  surveys 
should  employ  the  same  concept.  Related  to  this  concern  with 
concept  is  how  the  questions  are  phrased,  since  slight  differences 
in  wording  or  the  context  in  which  they  are  presented  can 
have  a  very  substantial  effect  on  the  results.  There  is  extensive 
literature  on  techniques  for  developing  questionnaires  to  over- 
come respondent  reluctance,  to  avoid  certain  biases,  and  to  fully 
convey    what    information    is    requested    of   the    respondent. 

Although  there  is  wide  familiarity  with  the  principles  in- 
volved in  dealing  with  these  and  related  problems,  the  amount 
of  attention  given  to  this  subject  varies  with  the  focus  of  the 
study. The major subject of a survey usually receives much more
intensive  consideration  than  secondary  or  classification  variables. 
As  an  example,  the  questionnaire  space  and  concern  for  maxi- 
mum accuracy  of  reports  on  a  subject  could  be  quite  different 
for  a  health  survey  using  labor  force  status  as  a  classifier  and  a 
labor  force  survey  using  health  as  a  classifier. 

Yet  much  can  be  done  to  standardize  questions  without 
excessive  rigidity  as  illustrated  by  recent  efforts  of  the  Social 
Science  Research  Council  and  the  Statistical  Policy  Division  of 
the Office of Management and Budget.

Population  Coverage 

For  comparability,  it  is  necessary  that  the  surveys  cover  the 
same  universe,  not  only  conceptually  but  in  practice.  Census 
household surveys have been studied with regard to coverage
losses.  These  studies  indicate  that  a  small  part  of  the  U.S.  popu- 
lation is  not  represented  because  of  deficiencies  in  the  sampling 
frame  or  failure  in  the  field  to  identify  and  include  all  persons 
who  belong  in  the  sample. 

Montie and Schwanz,¹ in a paper presented at this meeting,
have  described  the  recognized  gaps  in  census  household  survey 
sampling  frames.  The  combined  impact  of  these,  plus  the  field 
misses  can  be  summarized  as  a  ratio  of  the  survey  results  (in- 
flated by  the  inverse  of  the  sampling  rate)  to  independent  esti- 
mates of  the  universe.  This  is  routinely  done  in  major  census 
studies.  In  CPS,  the  overall  coverage  ratio  for  persons  14  years 
old  and  over  has  averaged  around  96  percent.  Since  the  in- 
dependent estimate  used  as  a  control  excludes  persons  missed 
in  the  decennial  census,  the  best  estimate  of  survey  under- 
coverage  is  about  2.5  percent  higher.  This  overall  figure,  how- 
ever, encompasses  groups  with  widely  different  coverage  ratios. 
For  example,  young  Black  men  are  usually  missed  at  a  rate 
several  times  that  for  middle  aged  White  males.  To  the  extent 
that  those  covered  in  the  survey  have  the  same  characteristics 
as  those  missed  and  an  adjustment  is  made  for  the  number 
missed,  the  survey  results  may  not  be  seriously  affected.  How- 
ever, this  assumption  cannot  easily  be  made. 
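
The coverage ratio described above can be sketched as a short
calculation. The inputs below are hypothetical, chosen only so
that the ratio comes out at the 96 percent cited for CPS, and the
function and variable names are illustrative assumptions rather
than Census Bureau terminology.

```python
def coverage_ratio(sample_count, sampling_rate, independent_estimate):
    """Survey count inflated by the inverse sampling rate, over an independent estimate."""
    inflated = sample_count / sampling_rate
    return inflated / independent_estimate

# Hypothetical inputs: 96,000 sample persons drawn at a 1-in-1,500 rate,
# compared with an independent estimate of 150 million persons.
ratio = coverage_ratio(96_000, 1 / 1_500, 150_000_000)
print(f"coverage ratio: {ratio:.3f}")
```

A ratio below 1 indicates that the inflated survey count falls
short of the independent control.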

Adjusting survey data to independent estimates of the population
reduces sampling variability and adjusts for survey undercoverage.
In most census surveys designed to produce subnational data,
subnational controls are used, while in most national surveys in
which subnational data are byproducts (e.g., the CPS annual
demographic file), they are not. When the survey weighting is on
a national basis, State and local differentials in coverage ratios
are not recognized and adjusted for, as they are when State or
local controls are used.
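
A minimal sketch of the contrast between national and State
controls follows. All counts are hypothetical, and the simple
ratio adjustment shown is an assumption made for illustration,
not a description of the actual census weighting procedure.

```python
def ratio_adjust(weighted_counts, controls):
    """Per-cell factors that scale weighted survey counts to independent controls."""
    return {cell: controls[cell] / weighted_counts[cell] for cell in weighted_counts}

# Hypothetical weighted survey counts and independent controls for two States.
survey = {"State A": 900.0, "State B": 980.0}
state_controls = {"State A": 1000.0, "State B": 1000.0}

# State controls: each State receives its own adjustment factor.
state_factors = ratio_adjust(survey, state_controls)

# National control only: a single factor is applied everywhere.
national_factor = sum(state_controls.values()) / sum(survey.values())

for state, count in survey.items():
    by_state = count * state_factors[state]
    by_nation = count * national_factor
    print(state, round(by_state, 1), round(by_nation, 1))
```

With the single national factor, the State with poorer coverage
remains below its control and the other State is pushed above it.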

Nonresponse 

Another aspect of household surveys that can affect
comparability is nonresponse, either to the entire questionnaire or
some  part  of  it.  Although  imputation  of  missing  data  is  usually 
performed,  the  biases  that  may  arise  are  difficult  to  quantify. 

In  most  of  the  major  household  surveys  conducted  by  the 
Census  Bureau,  the  household  nonresponse  rates  tend  to  vary 
only  slightly.  Somewhat  greater  variation  in  item  nonresponse 
has  recently  been  experienced.  Imputation  for  nonresponse 
households  is  usually  based  on  race  and  residence.  For  item 
nonresponse  additional  characteristics  of  the  household  or 
individual  are  often  used. 
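
One common way to adjust for household nonresponse within cells
such as race and residence is a weighting-class adjustment.
Whether the census procedure takes exactly this form is an
assumption here; the records, cell labels, and weights below are
hypothetical.

```python
from collections import defaultdict

def nonresponse_factors(households):
    """Per-cell factor: weight of all eligible households over weight of respondents."""
    total = defaultdict(float)
    responded = defaultdict(float)
    for household in households:
        total[household["cell"]] += household["weight"]
        if household["responded"]:
            responded[household["cell"]] += household["weight"]
    # Assumes every cell has at least one responding household.
    return {cell: total[cell] / responded[cell] for cell in total}

# Hypothetical sample: cells are (race, residence) pairs.
households = (
    [{"cell": ("White", "urban"), "weight": 100.0, "responded": True}] * 3
    + [{"cell": ("White", "urban"), "weight": 100.0, "responded": False}]
    + [{"cell": ("Black", "urban"), "weight": 100.0, "responded": True}]
    + [{"cell": ("Black", "urban"), "weight": 100.0, "responded": False}]
)
factors = nonresponse_factors(households)
print(factors)
```

Respondent weights in each cell would then be multiplied by that
cell's factor, so that respondents stand in for nonrespondents
with the same race and residence.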

Conditioning  of  the  Sample 

Sometimes referred to as rotation group bias, this refers
to the fact that, in surveys featuring repeat interviewing with
the same household over time, household reports may differ
as a result of there having been earlier interviews.

¹Irene C. Montie and Dennis J. Schwanz, "Coverage Improvements in
the Annual Housing Survey," paper presented at the ASA meeting,
Chicago, Ill., 1977.

Surveys  with  rotating  samples  such  as  CPS  provide  an  oppor- 
tunity to  study  this  phenomenon.  In  CPS,  households  are  in- 
terviewed eight  times,  4  consecutive  months  in  1  year  and  the 
same  4  months  the  following  year.  Review  of  the  data  by 
rotation  group  or  month  in  sample  reveals  that  initial  CPS 
interviews  yield  higher  reports  of  employment  and  unemploy- 
ment than  the  other  rotations.  Unemployment  estimates  from 
households  in  the  first  month  of  interviewing  have  averaged 
about  10  percent  higher  than  those  for  the  full  sample.  Little 
is  known  about  the  causes  of  these  and  other  observed  con- 
ditioning effects. 
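
The comparison by month in sample can be sketched as an index of
each rotation group's estimate against the full-sample average.
The eight estimates below are hypothetical, set so that the first
month comes out about 10 percent above the full sample, and the
sketch assumes rotation groups of equal size.

```python
def rotation_group_index(estimates):
    """Index each month-in-sample estimate to the average over all groups (= 100)."""
    average = sum(estimates) / len(estimates)
    return [100.0 * value / average for value in estimates]

# Hypothetical unemployment estimates (thousands), months in sample 1 through 8.
by_month_in_sample = [5500, 5000, 4950, 4900, 5000, 4900, 4900, 4850]
index = rotation_group_index(by_month_in_sample)
print([round(value, 1) for value in index])
```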

For  subjects  known  to  have  such  effects,  comparing  or  com- 
bining results  from  surveys  with  different  conditioning  patterns 
should  be  approached  cautiously. 

Respondent  Recall 

When  retrospective  data  are  obtained  in  surveys,  frequently 
a  calendar  reference  period  is  designated.  This  period  is  usually 
set  to  reflect  a  balance  between  reliability  concerns  and  ability 
to  recall  events  accurately.  The  longer  the  reference  period  for 
which  recall  is  adequate,  the  smaller  the  required  sample  size 
is  to  achieve  a  target  reliability. 
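
The tradeoff between reference period and sample size can be
illustrated with a deliberately simple model. The sketch assumes
that events follow a Poisson process at a constant monthly rate
and that recall is perfect, so the coefficient of variation of
the event count is one over the square root of the expected
number of events. The rate and target below are hypothetical.

```python
import math

def required_sample_size(monthly_rate, reference_months, target_cv):
    """Respondents needed for the event-count CV to meet the target (Poisson assumption)."""
    expected_events_needed = 1.0 / target_cv ** 2
    return math.ceil(expected_events_needed / (monthly_rate * reference_months))

# Hypothetical: 0.01 events per person per month, target CV of 5 percent.
n_6_months = required_sample_size(0.01, 6, 0.05)
n_12_months = required_sample_size(0.01, 12, 0.05)
print(n_6_months, n_12_months)
```

Under these assumptions, doubling the reference period roughly
halves the required sample; recall loss over the longer period
would offset part of that gain.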

Two  basic  aspects  of  memory  problems  are  the  ability  to 
recall  an  event  altogether  and  the  ability  to  date  it  correctly. 
Placing  an  event  at  a  later  date  than  it  actually  occurred  is 
called  forward  telescoping  and  at  an  earlier  date,  backward 
telescoping.  Personal  surveys  with  bounded  interviews  provide 
opportunities  to  control  for  telescoping  across  reference  periods, 
but  do  not  prevent  telescoping  within  reference  period.  Bound- 
ing is  the  use  of  previous  reports  to  prevent  duplicate  report- 
ing of  an  event  in  a  later  interview. 
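
Bounding can be sketched as a comparison of the current reports
with those from the previous interview. The record layout and the
matching on a free-text description are simplifying assumptions;
in practice the comparison is a judgment made from the earlier
questionnaire.

```python
def apply_bounding(current_reports, previously_reported):
    """Keep only incidents not already recorded in the bounding interview."""
    kept = []
    for incident in current_reports:
        if incident["description"] not in previously_reported:
            kept.append(incident)
    return kept

# Hypothetical case: a burglary reported in the previous interview is
# reported again, telescoped forward into the current reference period.
previously_reported = {"garage burglary"}
current_reports = [
    {"description": "garage burglary", "reported_month": "1977-01"},
    {"description": "auto theft", "reported_month": "1977-02"},
]
kept = apply_bounding(current_reports, previously_reported)
print([incident["description"] for incident in kept])
```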

Another  concern  relates  to  the  proximity  of  the  interview  to 
the  reference  period.  For  example,  an  interview  covering  calen- 
dar year  1977  conducted  in  June  1978,  might  have  very  differ- 
ent results  from  a  January  1978  interview  covering  the  same 
period.  Both  recall  decay  and  forward  telescoping  out  of  the 
reference  period  may  occur. 

Collection  Procedures  and  Respondent  Rules 

Survey  data  may  also  be  affected  by  the  method  of  data 
collection. For some surveys and subjects, differences have been
noted  when  personal  interviews  are  compared  with  telephone 
or  mail  data  collection.  In  two  recent  tests  comparing  a  com- 
pletely personal  interview  procedure  with  mail  collection  and 
with  telephone  interviewing,  both  including  personal  followup 
of  nonrespondents,  personal  interviews  elicited  higher  levels 
of  reporting. 

Similarly  survey  results  may  be  affected  by  choice  of  respond- 
ent. Depending  on  the  subject  covered,  different  results  might 
be  experienced  by  selecting  one  person  to  report  for  an  entire 
household  rather  than  having  each  person  report  for  himself. 
Timing and Seasonality

Many  topics  of  surveys  exhibit  substantial  seasonal  variation. 
Care  should  be  used  in  combining  or  comparing  data  from  sur- 
veys that  may  reflect  seasonality  in  different  ways.  As  an 
example, the national Annual Housing Survey has generally been
conducted  in  October  and  November  of  each  year,  whereas  the 
SMSA  surveys  are  conducted  over  a  12-month  period  running 
from  April  to  March.  Comparisons  of  data  relating  to  the  cur- 
rent status  of  the  unit  or  household  such  as  vacancy  status, 
rent,  and  perceived  neighborhood  problems  could  be  affected 
by  the  season  in  which  the  report  was  obtained. 

Interviewer  Staffing 

Another  variable  in  census  surveys  relates  to  the  performance 
of  interviewers.  In  general,  census  attempts  to  prepare  interview- 
ers for  their  assignments  thoroughly.  For  interviewers  who  work 
on  CPS,  about  1  week  of  training  is  provided  before  any  inter- 
views are  conducted.  Then  3-or-more  days  of  on-the-job  training 
are  required. 

Studies  have  indicated  that  after  this  training,  it  takes  an  in- 
terviewer over  2  years  of  CPS  assignments  to  reach  peak  per- 
formance with  respect  to  such  measures  as  noninterview  rates, 
reinterview  results,  edit  problems,  and  efficiency. 

Although the permanent experienced staff is used as much as
possible, some surveys require hiring a temporary staff of
interviewers. In the city crime surveys, a temporary staff of about
125 interviewers was recruited in each city to work about 2
months. The field supervisor's ability to train such a staff
thoroughly and to maintain quality standards equivalent to
those in continuing surveys is, in such circumstances, less than
desirable. For experienced interviewers, by contrast, previously
developed skills generally carry over to new surveys.

These are illustrative of the elements that may affect the
joint  use  of  data  from  different  surveys.  There  are  others,  but 
these  in  particular  bear  on  the  specific  comparisons  described 
below. 


DISCUSSION  OF  THE  OBSERVED  DIFFERENCES 

Data  derived  from  three  pairs  of  surveys  are  presented  in 
tables  1  to  4.  The  following  relates  to  the  features  of  the  surveys 
that  may   account   for   the   significant  differences  presented. 

Poverty  and  Income  Data  From  CPS  and  SIE 

Table  1  indicates  that  the  poverty  count  from  SIE  was  7  per- 
cent lower  than  CPS.  The  mean  family  income  was  4  percent 
higher,  and  the  number  of  wives  in  the  paid  work  force  was 
also  4  percent  higher. 
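
The percentages quoted above can be checked directly against the
SIE and CPS figures printed in table 1. The helper name below is
an illustrative assumption.

```python
def percent_difference(sie, cps):
    """Difference of the SIE estimate from the CPS estimate, as a percent of CPS."""
    return 100.0 * (sie - cps) / cps

# (SIE, CPS) pairs as printed in table 1.
table1 = {
    "Persons in poverty (millions)": (24.0, 25.9),
    "Mean family income (dollars)": (16142.0, 15546.0),
    "Wives in paid labor force (thousands)": (21688.0, 20833.0),
}
differences = {item: percent_difference(*pair) for item, pair in table1.items()}
for item, value in differences.items():
    print(f"{item}: {value:+.1f} percent")
```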

The  questionnaire  and  the  reference  period  were  identical 
with  regard  to  these  topics.  However,  other  features  of  the 
survey  were  notably  different,  one  being  the  objective  of  the 
surveys. 


Table 1. Estimates Derived From SIE and CPS for Selected Items

(Thousands, except as noted)

Item                                      SIE¹         CPS²

Persons in poverty (millions)             24.0         25.9
Mean family income                   $16,142.0    $15,546.0
Wives in paid labor force             21,688.0     20,833.0
In metropolitan areas outside
  central cities                      83,549.0     81,963.0
Not in metropolitan areas             66,224.0     68,206.0

¹Survey of Income and Education.
²Current Population Survey.


CPS is primarily a monthly labor force survey that includes
supplemental questions in most months, and some of these
supplements recur annually. Each March, the questionnaire includes
questions on income, annual work experience, and migration.
The interviewers are advised that their primary responsibility
is to obtain accurate and complete labor force reports.
Additional information for supplements should not result in
changing the labor force report, nor should it be permitted to
seriously jeopardize the cooperation of the household in future
months.

In  contrast,  interviewers  and  respondents  were  told  that  the 
principal  purpose  of  the  SIE  was  to  obtain  income  information. 
Since  there  were  no  future  interviews  with  the  same  households 
planned,  it  was  possible  to  be  somewhat  more  vigorous  in 
efforts  to  obtain  complete  reporting,  particularly  in  reducing 
item  nonresponse  for  income. 

A  feature  that  may  have  contributed  to  lower  nonresponse 
in  SIE  was  the  longer  interviewing  period.  In  CPS  the  data  col- 
lection period  is  basically  1  week  as  compared  to  10  weeks  for 
SIE.  Thus,  for  SIE  there  was  ample  opportunity  to  recontact 
households  for  missing  information  or  to  speak  with  the  best 
qualified  respondent  within  the  household. 

In  CPS  about  40  percent  of  the  interviews  were  conducted 
by  telephone,  a  procedure  used  in  all  but  the  first  and  fifth 
month  interviews.  All  SIE  interviews  were  designated  for  per- 
sonal contact,  although  telephone  use  was  permitted  for  re- 
contacts  for  data  that  could  not  be  obtained  during  the  initial 
visit.  There  is  some  indication  that  telephone  interviews  in  CPS 
are  less  complete  than  personal  interviews  in  respect  to  the 
annual  income  supplement. 

Another  feature  that  may  have  affected  the  overall  quality 
of  the  two  surveys  is  the  timing  of  the  interviews.  Interviewing 
for  CPS  took  place  in  mid-March  but  for  SIE  it  was  in  May, 
June, and July. The advantage of the later interviewing was
that the income tax filing date had passed. A possible
disadvantage comes from the fact that interviews in May, June,
and July are more removed from the reference year of 1975.
To  the  extent  that  income  records  were  not  used,  memory 
decay  and  telescoping  could  have  been  problems. 
Offsetting the positive features of the SIE compared to those
of  CPS  was  the  heavy  use  of  inexperienced  interviewers  in  the 
SIE.  As  noted  above,  about  half  of  the  SIE  interviews  were 
completed  by  temporary  staff.  During  the  short  time  the  data 
collection  lasted  it  was  not  possible  to  implement  quality 
standards  to  the  same  extent  as  in  a  continuing  survey  like  CPS. 

It  is  difficult  to  isolate  the  effects  of  any  one  aspect  of  the 
survey  operation,  but  one  can  examine  some  measures  that  can 
be considered indications of data quality. The household
noninterview rate and item nonresponse were lower in SIE than in
CPS. In the March 1976 CPS, 19.5 percent of all persons failed
to answer all income questions, in contrast with 13.0 percent in
SIE. This resulted in allocation of 20.1 percent of CPS income
and 12.1 percent of SIE income. Population coverage, on the
other hand, was poorer in SIE. Finally, the reinterview program
indicated greater underreporting of income in CPS than in SIE.

On  the  question  of  whether  the  CPS  and  SIE  data  should 
be  combined  to  reduce  sampling  variability,  the  Census  Bureau 
and  HEW  concluded  that  they  should  not.  Overall,  SIE  appeared 
to  be  the  more  successful  income  poverty  survey  despite  the 
problems  associated  with  the  use  of  an  inexperienced,  tem- 
porary interviewer  staff.  The  CPS  results  were  sufficiently  dif- 
ferent from  SIE,  particularly  at  the  State  level,  to  make  a 
combined  estimate  undesirable  for  the  purposes  stated  in  the 
legislation  mandating  the  program. 


Labor  Force  Data  From  CPS  and  NCS 

Table 2 covers first-month households only and table 3
covers all households, showing differences of 20 percent and 10
percent, respectively, in the unemployment rate, with the NCS
lower in both instances.

How do the surveys differ in ways that might account for the
difference?

The  first  notable  difference  is  in  the  questionnaire.  The  NCS 
battery of questions is shorter and lacks some items needed
to  fully  implement  the  CPS  classification  scheme.  Determination 
of  employment  status  from  imperfectly  filled  questionnaires 
also  follows  slightly  different  rules  in  the  two  surveys. 

In  CPS,  a  household  respondent  provides  reports  for  all  per- 
sons in  the  household,  whereas  in  NCS  each  person  responds 
for  himself.  At  least  one  personal  visit  per  household  is  re- 
quired for  NCS,  but  not  for  CPS  in  all  months  of  interview. 
Thus,  there  is  more  telephone  interviewing  in  CPS  than  in  NCS. 

The purposes of the surveys are different: the primary
attention of interviewer and respondent is directed to crime
victimization in the NCS and to labor force status in CPS. Limited
time and attention are given to the labor force questions in NCS
training relative to CPS.

The  CPS  estimation  procedure  includes  a  compositing  of  a 
measure  of  change  and  of  level  in  order  to  reduce  sampling  vari- 
ability on  month-to-month  change.  For  consistency,  annual 
average  estimates  are  based  on  composited  monthly  data.  The 
effect of compositing CPS data that exhibit rotation bias is
an expected value different from that of noncomposited data. If
CPS noncomposited data were compared with NCS data, the
differences would be greater than those shown above.
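
A minimal sketch of how compositing a level and a change measure
can move an estimate away from the direct monthly level. The
recursive form and the weight k = 0.5 are assumptions chosen for
illustration; the paper does not state the CPS formula, and the
monthly figures are hypothetical.

```python
def composite_series(levels, changes, k=0.5):
    """Blend each month's direct level with last month's composite plus estimated change."""
    composites = [levels[0]]
    for level, change in zip(levels[1:], changes):
        carried_forward = composites[-1] + change
        composites.append((1 - k) * level + k * carried_forward)
    return composites

# Hypothetical direct monthly levels, and month-to-month changes measured
# only on the rotation groups that were in sample in both months.
levels = [100.0, 102.0, 101.0]
changes = [1.0, -2.0]
composites = composite_series(levels, changes)
print(composites)
```

When the change measured on continuing households differs from
the change in the direct levels, as it would under rotation group
bias, the composite and the direct estimate diverge.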

The  reference  period  also  varies.  In  CPS,  it  is  the  week  con- 
taining the  twelfth  day  of  the  month;  in  NCS  it  is  the  week  be- 
fore the  interview,  and  interviews  are  conducted  during  the 
first  2  weeks  of  the  month. 
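
The two reference periods can be compared with a short date
calculation. The sketch assumes that a week runs from Sunday
through Saturday and that "the week before the interview" means
the seven days ending the day before the interview; neither
convention is stated in the paper, and the interview date is
hypothetical.

```python
from datetime import date, timedelta

def cps_reference_week(year, month):
    """(Sunday, Saturday) of the calendar week containing the twelfth of the month."""
    twelfth = date(year, month, 12)
    days_since_sunday = (twelfth.weekday() + 1) % 7  # weekday(): Monday is 0
    start = twelfth - timedelta(days=days_since_sunday)
    return start, start + timedelta(days=6)

def ncs_reference_week(interview_day):
    """Seven days ending the day before the interview."""
    return interview_day - timedelta(days=7), interview_day - timedelta(days=1)

cps_week = cps_reference_week(1976, 3)
ncs_week = ncs_reference_week(date(1976, 3, 3))
print("CPS:", cps_week[0], "to", cps_week[1])
print("NCS:", ncs_week[0], "to", ncs_week[1])
```

For an NCS interview early in the month, the two surveys describe
labor force status in different weeks.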

The conclusion again is that the surveys are sufficiently
different that for many purposes joint use of the data is
inadvisable.


Table 2. Comparison of NCS and CPS Labor Force Distributions
for First-Month Households: January-June 1973

(Percent)

Labor force status                        NCS¹     CPS²

All Persons Interviewed

Total interviewed persons,
  16 or more                             100.0    100.0
  In labor force                          61.5     61.3
    Employed                              58.7     57.8
      Working                             56.0     55.2
      With job, not at work                2.7      2.7
    Unemployed                             2.8      3.5
  Not in labor force                      38.5     38.7
    Keeping house                         34.4     23.9
    Going to school                        5.1      5.3
    Unable to work                         2.2      1.8
    Other                                  7.9      7.6

In and Not in Labor Force

Total interviewed persons,
  16 or more
  In labor force                         100.0    100.0
    Employed                              95.4     94.3
      Working                             91.0     89.9
      With job, not at work                4.4      4.3
    Unemployed                             4.6      5.7
  Not in labor force                     100.0    100.0
    Keeping house                         60.5     61.9
    Going to school                       13.4     13.8
    Unable to work                         5.6      4.7
    Other, including retired              20.6     19.7

¹National Crime Survey.
²Current Population Survey.


Table 3. Comparison of NCS and CPS Labor Force
Distributions, All Rotations: 1974 and 1975

(Percent)

                                         1974              1975
Labor force status                    NCS      CPS      NCS      CPS

All Persons

Total civilian noninstitutional
  population, 16 or more            100.0    100.0    100.0    100.0
  In labor force                     61.2     61.2     61.5     61.2
    Employed                         58.1     57.8     57.5     56.0
    Unemployed                        3.1      3.4      4.0      5.2
  Not in labor force                 38.8     38.8     38.5     38.8

Labor Force

Total civilian noninstitutional
  population, 16 or more
  In labor force                    100.0    100.0    100.0    100.0
    Employed                         95.0     94.4     93.5     91.5
    Unemployed                        5.0      5.6      6.5      8.5
  Not in labor force


Victimization Data From the National and City
Crime Surveys

Table 4 shows that half of the victimization rates derived
from the city surveys were significantly different from national
survey data for the same cities. In four out of five significant
differences the city survey estimates were higher than the
national survey estimates.

A major difference between the surveys, as noted above, is
that the national survey is a continuing survey using the staff
of experienced interviewers. The cities surveys, which have been
discontinued, were scheduled to be conducted at intervals using
temporary staff. The concerns regarding the use of temporaries
presented earlier apply here as well. Because the goals of the
two undertakings were somewhat different, the designs were
different in important ways. The national program reference
period is 6 months, with telescoping across reference periods
controlled by bounding. In the city surveys, with interviews
spaced at intervals greater than the length of the reference
period, no bounding was possible. Since, on the basis of
experimentation, it appeared that recall decay was not a serious
problem for a 12-month period as compared with 6 months, the
longer period was used in the cities.

Because of telescoping concerns relative to the manner in
which the results were to be compiled, the longer period did not
appear appropriate for the national survey.

The effect of using unbounded data for the cities would
be to increase victimization reports.

Again the surveys differ so substantially that it was
recommended in a report by the Committee on National Statistics
of the National Academy of Sciences that the two surveys be
integrated into a single data collection.² The discontinuation
of the city surveys was partly based on this recommendation.

²National Research Council, Panel for the Evaluation of Crime
Surveys, Surveying Crime (Washington, D.C., National Academy of
Sciences, 1976) pp. 4-5.


Table 4. Comparison of NCS National and Cities Sample
Data for Five Largest Cities: 1974

                          Personal             Household
Item                      victimization        victimization
                          rate¹ (per 1,000)    rate² (per 1,000)

Chicago:
  National sample              156                  212
  Cities sample                152                  245
Detroit:
  National sample              241                  319
  Cities sample                ³68                  330
Los Angeles:
  National sample              166                  285
  Cities sample                178                 ³332
New York:
  National sample               76                  114
  Cities sample               ³108          ³[illegible]
Philadelphia:
  National sample              122                  166
  Cities sample                134                 ³209

¹Personal victimizations: rape, robbery, assault,
personal larceny.

²Household victimizations: burglary, household
larceny, auto theft.

³Significantly different from the national sample
estimate at the 95 percent confidence level.


SUMMARY AND CONCLUSIONS

This paper has presented very brief descriptions of surveys
that, because of their size or design features, can be used to
develop subnational estimates. Selected features of survey
design and operations that can affect comparability were
reviewed. Finally, summary comparisons were made for three
pairs of surveys, showing in each instance that the results were
significantly different.
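
A judgment that two survey estimates are "significantly
different" at the 95 percent confidence level can be sketched as
a two-sample z test. The estimates below are the personal
victimization rates printed in table 4 for New York and Chicago,
but the standard errors are hypothetical (the paper does not
report them), and the test assumes the two samples are
independent.

```python
import math

def z_statistic(estimate_a, se_a, estimate_b, se_b):
    """z statistic for the difference between two independent estimates."""
    return (estimate_a - estimate_b) / math.sqrt(se_a ** 2 + se_b ** 2)

def significantly_different(estimate_a, se_a, estimate_b, se_b, critical=1.96):
    """True when the difference exceeds the two-sided 95 percent critical value."""
    return abs(z_statistic(estimate_a, se_a, estimate_b, se_b)) > critical

# Cities sample versus national sample; standard errors are hypothetical.
new_york = significantly_different(108, 5.0, 76, 8.0)
chicago = significantly_different(152, 6.0, 156, 10.0)
print(new_york, chicago)
```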

For each example the recommendation was to avoid joint use
of the data, but there may be applications in which joint use
would be indicated. If an examination of the data revealed no
important differences for a characteristic of interest (e.g.,
high income families from CPS and SIE, personal injury data
from the crime surveys), then it would be appropriate to
combine them to reduce sampling variability. For changes over
time, it would be appropriate to use data from different sources
if there is evidence that methodology would create small
differences relative to the actual changes over time and the
precise magnitude of change is not of great concern.

For other uses of the data from a survey, notably as
descriptors and for interarea comparisons, a different set of
criteria should be used in judging the utility of the data. Fewer
problems are encountered in such uses.

Although this paper points out existing differences between
survey estimates and some features of the surveys that may
account for the differences, it is difficult to quantify the effect
of each feature. Some data are available on bounding and
rotation effects, but more work is required on respondent rules,
interviewer training, population undercoverage, and nonresponse.




Problems  in  Matching  Federal  Allocation 
Formulas  to  Program  Objectives 


Charles  D.  Troob 

National  Institute  of  Education 

INTRODUCTION 

During  the  past  year,  a  subcommittee  of  the  Federal 
Committee  on  Statistical  Methodology,  chaired  by  Wray  Smith 
of  the  Department  of  Health,  Education  and  Welfare  (HEW), 
has  been  examining  the  use  of  statistics  in  the  allocation  of 
funds.  The  subcommittee,  several  of  whose  members  are  also  on 
the  parent  Committee,  consists  of  Federal  officials  from  a 
number  of  different  agencies,  including  some  with  responsibility 
for  the  data  actually  used  in  grant-in-aid  formulas.  Members  of 
the  subcommittee  produced  case  studies  of  the  formulas  and 
data  used  during  Fiscal  Year  1977  in  five  major  Federal 
grant-in-aid programs: General Revenue Sharing (GRS), the
Comprehensive  Employment  and  Training  Act  (CETA), 
Community  Development  Block  Grants  (CDBG),  Aid  to 
Families  with  Dependent  Children  (AFDC),  and  Title  I  of  the 
Elementary  and  Secondary  Education  Act  (ESEA).  The  case 
studies  were  used  in  the  preparation  of  a  report,  which  includes 
a discussion of general features of allocation formulas, an
analysis  of  how  they  may  fail  to  meet  their  objectives,  findings, 
and  recommendations. 

The formulas studied have an essential similarity: they all are
decision  rules  for  transferring  Federal  revenues  to  a  large 
number  of  localities,  and  the  rules  use  statistical  data,  either 
collated  from  administrative  records  or  based  on  statistical 
surveys.  The  formulas  have  many  structural  characteristics  in 
common;  except  for  AFDC,  they  all  rely  on  data  from  the  1970 
Census  of  Population  and  Housing  which  are  updated  in  some 
cases.  The  formula  designers  thus  confronted  similar  sets  of 
problems,  which  were  resolved  in  a  number  of  different  ways. 
Some  of  the  differences  among  the  formulas  can  be  attributed 
to  characteristics  of  the  programs,  i.e.,  the  range  of  services 
which  the  program  funds  may  be  used  to  purchase,  the  number 
of  jurisdictions  eligible  for  assistance,  whether  the  program  is 
meant  to  ameliorate  gradually  a  long  term  problem  or  to 
respond  rapidly  to  a  short  term  crisis.  On  the  other  hand,  other 
formula  differences  seemed  independent  of  program  differences, 
and  can  be  thought  of  as  independent  provisional  solutions  to  a 
number  of  formal  and  practical  problems  in  program  design. 

The  identification  of  common  problems,  and  of  their 
alternative  solutions,  was  one  of  the  most  interesting  aspects  of 
our  work.  It  helped  us  to  define  a  number  of  general  issues,  and 
also  to  make  recommendations  about  the  wisdom  or  unwisdom 
of  particular  procedures.  The  experience  of  subcommittee 
members  with  programs,  other  than  the  five  under  scrutiny,  was 
very  helpful  in  this  area. 

We organized our thinking around a paradigm: each formula
relates the money received by a locality to a measure which
indicates the "Need" for the kind of service sponsored by the
particular program.¹ The relation between the funding level and
the measure of "Need" is often modified by what we called
"Capability" and "Effort," as well as by constraints of various
kinds. While the paradigm had to be applied flexibly (particular
features of formulas can often be interpreted in different
ways), it proved to be a useful way to classify formula elements.
In the section to follow, I will present a personal synthesis of
the problems we discussed. While my comments are heavily
influenced by the findings and opinions of various subcommittee
members, they are not to be taken as an official view. In
particular, my strong emphasis on the political implications of
funding issues might well be thought inappropriate by other
members. In the last part of the paper, I list and explain the
actual recommendations of the subcommittee. For more detail,
interested readers are referred to the Report.²
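
The paradigm can be made concrete with a toy formula. The
functional form below (Need times Effort divided by Capability,
with a guaranteed minimum grant standing in for "constraints of
various kinds") is a hypothetical illustration and is not the
formula of any of the five programs; all figures are invented.

```python
def allocate(total, localities, floor=0.0):
    """Give every locality the floor, then share the rest in proportion to its weight."""
    weights = {
        name: data["need"] * data["effort"] / data["capability"]
        for name, data in localities.items()
    }
    remainder = total - floor * len(localities)
    weight_sum = sum(weights.values())
    return {name: floor + remainder * w / weight_sum for name, w in weights.items()}

# Hypothetical localities: C has less Need and twice the fiscal Capability.
localities = {
    "A": {"need": 500.0, "capability": 1.0, "effort": 1.0},
    "B": {"need": 300.0, "capability": 1.0, "effort": 1.0},
    "C": {"need": 200.0, "capability": 2.0, "effort": 1.0},
}
grants = allocate(1000.0, localities, floor=100.0)
for name, amount in grants.items():
    print(name, round(amount, 2))
```

Even in this toy version, the choice of weights and of the floor
determines which localities gain or lose.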

Problems 

This  discussion  of  formula  problems  is  organized  around  four 
themes:  Problems  related  to  data  quality,  problems  of  appro- 
priateness of  available  data,  problems  of  formula  design,  and 
problems of authority and responsibility for statistical issues. I
perceive the first three issues as largely but not entirely
technical; the fourth issue has important political and
administrative implications.

1. Data quality: how good are the numbers?

There is no need to explain to an audience of statisticians
why  our  data  series  do  not  count  whatever  it  is  they  are 
supposed  to  count.  The  analysis  and  reduction  of  bias  and 
variance  are  part  of  the  statistician's  everyday  life.  Even 
economists, like myself, are occasionally concerned about
"errors  of  observation."  Instead  of  recapitulating  the 
familiar,  I  will  try  to  explain  how  these  issues  are  placed  in 
an  unfamiliar  context  when  data  are  used  in  the  allocation  of 
funds. 

The  key  issue  is  that  we  no  longer  are  looking  for  the 
"best"  numbers,  but  for  the  "fairest"  or  "best"  allocation. 
Survey  design  has  implications  for  public  policy,  and  must  be 
judged  accordingly. 

Let  me  give  some  examples  of  possible  issues  in  this  area. 
Data  with  standard  errors  over  a  given  percent  might  be 
judged  unacceptable  for  formula  use.  Large-scale  funding 
programs  might  be  required  to  include  statistical  overhead,  in 
order  to  support  adequate  data  collection.  We  need  to  clarify 
the  legal  status  of  data  known  to  be  incorrect  for  a  particular 
area,  but  generated  by  an  unbiased  procedure. 


¹ Economists on the subcommittee objected to the loose use of the
term "Need," for reasons inherent in modern demand theory, but agreed
to accept the usage provided the term was capitalized and caveated.

² The "Report on Statistics for Allocation of Funds," Statistical
Working Paper 1, Office of Federal Statistical Policy and Standards, is
currently available from H. Hyatt, Office of the Assistant Secretary for
Planning and Evaluation, HEW.
A number of such issues may arise when data from the
1976 Survey of Income and Education (SIE) are considered for use
in the formula for Title I, ESEA. The survey was commissioned
to generate counts of children in poverty at the
State level; currently Title I uses such counts from the 1970
census. Because of cost limitations, the SIE produced counts
with coefficients of variation of approximately 10 percent,
or 95 percent confidence intervals of approximately ±20
percent. The variance will certainly be used as an argument
against the adoption of data from the survey, presumably by
the losing States.
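The step from a coefficient of variation to a confidence interval can be checked with a short calculation. The sketch below assumes a normal approximation (a multiplier of 1.96 for a 95 percent interval); the State count of 100,000 is hypothetical and is not taken from the SIE.

```python
# Normal-approximation sketch: half-width of a confidence interval,
# expressed as a percent of the estimate, from a coefficient of variation.
Z_95 = 1.96  # standard normal critical value for a 95 percent interval


def relative_half_width(cv_percent, z=Z_95):
    """Return the interval half-width as a percent of the estimate."""
    return z * cv_percent


def interval(estimate, cv_percent, z=Z_95):
    """Return (low, high) bounds for an estimate with the given CV."""
    half = estimate * relative_half_width(cv_percent, z) / 100.0
    return estimate - half, estimate + half


# Hypothetical State count of children in poverty, with a CV of 10 percent.
low, high = interval(100_000, 10.0)
print(f"half-width: about {relative_half_width(10.0):.1f} percent")
print(f"interval: {low:,.0f} to {high:,.0f}")
```

Under these assumptions a 10 percent coefficient of variation gives a half-width just under 20 percent, which is the rounding used in the text.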

The  subcommittee  report  includes  a  great  deal  of  analysis 
of  problems  of  data  quality.  A  section  of  the  third  chapter 
describes  how  issues  of  cost,  bias,  and  variance  enter  into  the 
planning  for  the  1985  census,  and  into  strategies  for 
minimizing  the  census  undercount.  Regression  techniques  are 
advocated  as  a  way  of  using  low-bias  high-variance  sample 
data  to  correct  the  bias  of  statistics  derived  from  adminis- 
trative records. 

In addition, a paper by Tom Jabine, appended to the
report,  deals  with  issues  of  equity  and  sampling  design  which 
apply  to  any  survey  which  is  intended  to  determine  the 
number  of  residents  of  each  State  who  have  a  particular 
characteristic.  For  example,  the  1976  Survey  of  Income  and 
Education  was  designed  to  provide  counts  by  State  of 
children  in  poverty. 

Two problems related to data quality seem endemic to all
formula allocation programs: timeliness and allocation to
small areas. In the decennial census, relatively accurate data
are produced for small areas, but the quality of small area
data (including State data) is limited in intercensal years.
The open questions are what kinds of intercensal surveys to take,
what kinds of adjustment of census data are appropriate, whether
estimates from the Current Population Survey should be used, and
how funds should be allocated to small areas. Each program we studied
embodied  decisions  in  these  areas,  and  there  was  considerable 
variation  among  the  programs. 

The countercyclical thrust of CETA, particularly titles
II and VI, requires data as current as possible. These titles
of  CETA  target  to  "areas  of  substantial  unemployment," 
which  are  not  necessarily  contiguous  with  political  juris- 
diction. Not  surprisingly,  this  program  depends  more  than 
the  others  on  relatively  tenuous  estimation  procedures. 

GRS  allocates  by  formula  to  all  jurisdictions.  It 
allocates  first  to  States  by  a  complex  procedure,  which 
utilizes two formulas; then substate, by a formula which
requires  less  data.  It  has  been  difficult  to  produce  with 
acceptable  accuracy  for  all  areas  even  the  few  data 
elements  used  in  substate  allocation.  GRS  allocations 
utilize  updates  of  several  census  series. 

CDBG  allocates  by  formula  only  to  relatively  large 
jurisdictions,  such  as  urban  counties  and  metropolitan 
cities.  Competitive  grants  are  used  for  smaller  areas.  The 
CDBG formula uses updated population counts, but uses
1970 poverty counts.

ESEA  allocates  by  formula  to  counties.  States  allocate 
county grants to school districts, using subcounty allocation
procedures consistent within each State and subject
to  Federal  guidelines.  Census  data  are  not  updated,  but 
counts  of  AFDC  children,  and  children  in  other  cate- 
gories, are  collected  each  year.  These  counts,  derived  from 
administrative  records,  are  sometimes  claimed  to  update 
the  formula. 

AFDC  uses  current  caseload  data  to  determine  the  base 
on  which  reimbursement  is  computed.  The  reimbursement 
rate  depends  on  State  per  capita  income,  updated  by  the 
Department  of  Commerce,  and  averaged  over  3  years. 

(2) Appropriateness of measures: what do the numbers mean?

One  important  implication  of  the  "Need,  Capability,  and 
Effort"  paradigm  is  that  the  counts  used  in  funding  formulas 
are  indicators  meant  to  represent  things  other  than  them- 
selves. In  contrast,  this  is  not  true  of  population  counts  used 
for  apportionment  purposes.  For  several  of  the  programs  we 
studied,  the  appropriateness  of  the  "Need"  measure  has  been 
a  matter  of  dispute,  and  measures  of  "Capability"  and 
"Effort"  are  if  anything  more  problematic. 

The  use  of  poverty  counts  rather  than  counts  of 
low-achieving  children  has  become  controversial  for  Title 
I.  Many  studies  have  questioned  the  appropriateness  of 
measures  expressed  in  monetary  terms,  when  these  are 
unadjusted  for  local  price  differences.  Per  capita  income 
as  a  "Capability"  measure,  or  as  a  base  on  which  "Effort" 
is  calculated,  is  also  criticized  for  insensitivity  to  varia- 
tions in  fiscal  burden. 

The  general  problem,  that  only  a  limited  range  of  data 
series  can  feasibly  be  collected,  is  exacerbated  by  the  peculiar 
requirements  of  funding  formulas,  which  require  measures 
which  are  consistently  defined  and  collected  for  all  the 
jurisdictions  which  may  have  a  claim  to  a  share  of  the 
funding.  This  further  restricts  the  choice  of  data  series.  The 
problem  can  be  somewhat  alleviated  by  the  use  of  different 
data  series  for  allocation  within  different  areas,  e.g.,  different 
series  for  substate  allocation  within  different  States. 

The  problem  of  appropriateness  goes  beyond  mere  data 
limitations.  There  are  two  more  subtle  issues.  First,  Congress 
is  reluctant  to  define  program  objectives  very  precisely.  This 
may make moot the whole question of appropriateness,
though it does not eliminate it. A program without clearly
stated  objectives  may  still  have  objectives.  Second,  because 
localities  have  considerable  discretion  over  the  use  of  the 
funding,  a  true  measure  of  formula  parameters  would  have  to 
take  into  consideration  how  the  funding  is  used  in  different 
places.  Each  of  these  points  will  be  explained  in  turn. 

In general, the statement of purposes attached to the
authorization of a grant-in-aid program is very broad. One
cannot deduce a particular funding arrangement from such
statements. The funding objectives are usually stated only in
the presentation of the formula. That is, the decision rule for
giving out the money is the only clear statement of how
Congress wishes the money to be allocated. There is no
attempt to provide a consistent, binding rationale for the
formula against which the actual performance can be
measured. There are many perfectly good reasons for this
state of affairs. The unfortunate result, however, is that the
formula  actually  chosen  and  the  measures  used  in  the 
formula  are  frequently  regarded  as  the  immaculately  con- 
ceived manifestation  of  legislative  will,  until  amended.  While 
it  may  be  good  law  to  say  that  Congress  meant  to  do 
precisely  what  it  did,  perusals  of  the  legislative  history  often 
reveal  that  Congress  chose  a  particular  measure  with  reluc- 
tance, only  because  a  better  one  was  not  available.  In  1974, 
dissatisfaction  with  measures  of  poverty  available  for  use  in 
ESEA  funding  led  to  a  series  of  mandated  studies  aimed  at 
improving  the  concept.  In  short,  the  absence  of  a  clear 
statement  of  funding  objectives,  and  the  presence  of  a  clear 
statement  of  the  formula,  are  not  reason  to  deduce  that  the 
formula  meets  the  objectives. 

Appropriateness  of  formula  measures  is  related  to  local 
behavior because the "Need" for a program depends on what
that  program  is.  If  one  area  uses  community  development 
funds  to  build  low  cost  housing,  another  to  build  swimming 
pools,  and  a  third  as  tax  relief  (using  the  funds  to  support 
projects  it  would  have  funded  locally  in  the  absence  of 
Federal  aid),  it  is  hard  to  imagine  a  measure  of  "Need"  for 
the  program  which  has  more  than  a  vague  plausibility.  In 
general,  the  broader  the  uses  to  which  funding  can  be  put, 
the  more  difficult  it  is  to  justify  any  particular  criteria  of 
"Need,"  "Effort,"  or  even  "Capability." 

There are three kinds of problems related to the matching
of formula elements to program objectives: the limited
number of data series which can be used; a reluctance on the
part of Congress to specify its objectives fully; and the
oversimplification involved when a single measure, or a small set
of measures, is used to represent concepts which are very
complex, and which may properly have different meanings in
different places.

My personal opinion is that this set of problems has very
powerful  implications  for  formula  allocation.  First,  the 
selection  of  measures  is  so  problematic  that  it  is  as  much  a 
political  procedure  as  a  scientific  one.  While  the  choice  of 
particular  measures  is  often  defended  on  quasi-scientific 
grounds,  the  motive  behind  the  choice  may  well  be  a 
distributional  one:  From  a  number  of  measures  which  can  be 
equally  defended  as  "appropriate,"  that  measure  is  chosen 
which  best  suits  the  distributional  objectives  of  the  policy- 
makers in  control  of  the  process.  An  example  of  this  from 
State  school  finance  is  the  choice  between  average  daily 
attendance  (ADA)  and  average  daily  membership  (ADM)  as 
the  enrollment  base  on  which  State  per  pupil  aid  is  to  be 
calculated. The choice between ADA and ADM is often made
according to which districts it is desired to favor; areas
with relatively low daily attendance prefer ADM, and vice
versa. Yet the arguments for ADA and ADM are often
supported by allegations that one or the other better
represents the true relationship between enrollment and
cost, as though anyone really knows about this, or indeed
cares.
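The distributional effect of the enrollment base can be illustrated with a small calculation. This is a sketch only: the two districts, their membership figures, and their attendance rates are hypothetical, and aid is assumed to be split strictly in proportion to the chosen base.

```python
def shares(base):
    """Return each district's share of aid when aid is proportional to `base`."""
    total = sum(base.values())
    return {district: value / total for district, value in base.items()}


# Hypothetical districts with equal membership but different attendance rates.
membership = {"District A": 10_000, "District B": 10_000}   # ADM
attendance_rate = {"District A": 0.95, "District B": 0.85}  # assumed rates

adm = dict(membership)
ada = {d: membership[d] * attendance_rate[d] for d in membership}

adm_shares = shares(adm)
ada_shares = shares(ada)
for d in membership:
    print(f"{d}: ADM share {adm_shares[d]:.3f}, ADA share {ada_shares[d]:.3f}")
```

The low-attendance district receives a larger share under ADM than under ADA, which is the preference described above.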


A  second  implication  is  that  the  use  of  measures  to  achieve 
funding  goals  is  likely  to  affect  the  measures  themselves.  For 
example,  the  heavy  use  of  per  capita  income  and  poverty 
counts  in  funding  formulas  has  created  strong  pressures  for 
the adjustment of these measures to reflect interarea price
level  differences  or  cost-of-living  differences,  which  are  more 
complex.  The  pressure  is  healthy— certainly  social  measures 
should  be  responsive  to  public  concern  about  how  they  are 
defined.  Yet  there  are  obvious  dangers  in  the  overtly  political 
nature  of  this  process— the  fact  that  changes  in  measures  may 
be  supported  by  those  who  stand  to  gain  by  the  change,  and 
may  be  obstructed  by  those  who  stand  to  lose. 

It  is  generally  agreed  that  Congress  has  the  right  to  create 
whatever  formula  it  likes.  It  is  also  generally  agreed  that  the 
numbers  put  into  that  formula  must  be  the  right  numbers,  as 
accurately  defined  and  measured  as  possible.  We  seem  to  be 
in  a  position  in  which  both  the  outcome  and  the  process  are 
judged by independent standards of fairness. Who sets these
standards,  how  they  are  to  be  enforced,  and  the  proper 
relation  among  the  standards  are  all  issues  that  are  as  yet 
murky.  Currently,  Congress  has  virtually  unlimited  power  in 
this  area,  but  one  can  imagine  that  the  courts  may  become 
heavily  involved. 

(3) Problems of formula construction: how do the pieces fit together?

Up to now, we have been on relatively familiar ground:
issues of data reliability in the first section, and data validity
in  the  second.  Formula  construction  is  an  issue  with  a  long 
history— as  any  expert  on  State  school  aid  procedures  can  tell 
you— but  a  relatively  unfamiliar  one.  The  fiscal  federalism 
literature  is  replete  with  comparisons  of  the  effectiveness  of 
closed  end  and  open  end  matching  grants,  as  opposed  to 
block  grants,  but  these  technical  questions  carry  us  only  so 
far.  In  the  words  of  Gramlich: 

At  some  point  research  will  have  to  go  beyond  the 
question everybody asks and answers (how much did local
spending change) and into some more basic questions,
such  as  whether  the  right  type  of  spending  was  en- 
couraged, by  the  right  governments,  for  the  right  citizens, 
and  even  the  ultimate  one  of  whether  the  spending 
accomplished its ultimate objectives.³

As  the  discussion  of  "Need"  measures  in  the  previous 
section  pointed  out,  until  we  fill  major  gaps  in  our 
understanding  of  how  grant  funding  is  transformed  into 
useful  spending,  and  what  the  effects  of  the  spending  are,  the 
entire  process  of  formula  construction  is  arbitrary.  This  is 
true  even  if  we  assume  that  the  desire  of  Congress  is  to  design 
the  formula  that  best  meets  the  "Need,"  rather  than  to 
design  a  formula  that  meets  the  "Need"  but  also  meets  a 
number  of  unrelated  funding  objectives. 


³ Edward M. Gramlich, "Intergovernmental Grants: A Review of the
Empirical  Literature",  prepared  for  The  International  Seminar  on  Public 
Economics  Conference,  Berlin,  January  1976. 
In view of this skepticism about the possibility of
constructing  formulas  which  maximize  program  objectives, 
we  suggested  a  paradigm  for  formula  building,  consistent  with 
program  objectives,  and  also  consistent  with  Congress's 
notion  of  an  appropriate  distribution  of  the  funds: 

a.  The  definition  of  "Need,"  "Capability,"  and 
"Effort,"  and  the  preliminary  choice  of  measures  of  these 
concepts; 

b. Some specification of the desired properties of the
allocation formula: how the allocation is to vary with
"Need;" how this basic relation is to be modified by
variation in "Capability" and "Effort;" how much year-to-year
fluctuation in allocations should be permitted;
whether other factors should influence the allocation, and
to what extent;

c.  Technical  construction  of  the  formula,  involving 
choice  of  formula  class  (additive  or  multiplicative), 
weighting  of  elements,  selection  of  constraints  on  the 
values  of  variables  entered  into  the  formula,  and  selection 
of constraints on the actual allocation.

Matching  the  formula  to  the  specification  is  partly  a 
mathematical  exercise,  but  other  important  considerations 
include  comprehensibility,  the  desired  level  of  complexity, 
and  attempts  to  predict  and  bound  the  dynamic  behavior  of 
the  formula. 
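The technical choices in step c can be made concrete with a small sketch. Everything in it is an assumption for illustration: the three areas, their "Need" counts, the "Capability" and "Effort" indices (1.0 taken as the national average), the additive weights, and the floor are all hypothetical, and neither functional form is drawn from any of the programs discussed.

```python
def proportional(scores, total):
    """Split `total` in proportion to each area's score."""
    s = sum(scores.values())
    return {area: total * v / s for area, v in scores.items()}


def multiplicative_scores(areas):
    # Need scaled up by Effort and scaled down by Capability.
    return {a: d["need"] * d["effort"] / d["capability"] for a, d in areas.items()}


def additive_scores(areas, weights=(0.6, 0.2, 0.2)):
    # Weighted sum of each area's share of Need, Effort, and inverse Capability.
    w_need, w_effort, w_cap = weights
    need = proportional({a: d["need"] for a, d in areas.items()}, 1.0)
    effort = proportional({a: d["effort"] for a, d in areas.items()}, 1.0)
    inv_cap = proportional({a: 1.0 / d["capability"] for a, d in areas.items()}, 1.0)
    return {a: w_need * need[a] + w_effort * effort[a] + w_cap * inv_cap[a]
            for a in areas}


def apply_floor(alloc, floor):
    """Raise any allocation below `floor` to the floor, rescaling the rest."""
    total = sum(alloc.values())
    fixed = set()
    while True:
        free = {a: v for a, v in alloc.items() if a not in fixed}
        scaled = proportional(free, total - floor * len(fixed))
        low = [a for a, v in scaled.items() if v < floor]
        if not low:
            return {**{a: floor for a in fixed}, **scaled}
        fixed.update(low)


# Hypothetical areas and appropriation.
areas = {
    "A": {"need": 50_000, "capability": 0.80, "effort": 1.1},
    "B": {"need": 30_000, "capability": 1.00, "effort": 1.0},
    "C": {"need": 20_000, "capability": 1.25, "effort": 0.9},
}
TOTAL = 1_000_000

mult = proportional(multiplicative_scores(areas), TOTAL)
add = proportional(additive_scores(areas), TOTAL)
floored = apply_floor(mult, 150_000)
for a in areas:
    print(f"{a}: multiplicative {mult[a]:,.0f}  additive {add[a]:,.0f}  "
          f"with floor {floored[a]:,.0f}")
```

With these inputs the multiplicative form concentrates funds on the high-"Need", low-"Capability" area more sharply than the additive form, and the floor moves money back toward the smallest recipient. The same data give three different distributions depending only on technical construction.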

By  comprehensibility,  we  mean  the  important  con- 
sideration that  laymen  be  able  to  understand  how  the 
formula  works,  so  that  policy  debate  is  meaningful;  and  that 
recipients  be  able  to  understand  how  their  own  allocations 
were  derived,  so  that  errors  in  data  or  in  calculation  can  be 
discovered  and  corrected. 

By  level  of  complexity,  we  mean  the  recognition  that  the 
world cannot be simulated in an allocation formula, and that
it  must  be  decided  how  much  attempt  to  mirror  "reality"  is 
feasible  and  desirable. 

By  dynamic  behavior,  we  mean  the  changes  in  formula 
elements  through  time  and  the  effect  that  these  may  have  on 
resulting  allocations.  This  year's  good  proxy  may  be  next 
year's  disaster. 

The value of this paradigm lies not in its use as a standard
operating procedure (though it might be a good idea to
develop formulas in this quasi-rational manner) but as an
illustration of the enormous number of decisions which are
made,  explicitly  or  implicitly,  when  a  formula  is  actually 
chosen.  It  is  no  wonder  that  formula  debates  are  so 
confusing.  All  kinds  of  arguments  are  jumbled  up  together, 
arguments  about  the  reliability  and  validity  of  data, 
weighting,  updating,  and  bounding  the  allocation.  Even  if 
program  design  were  purely  a  technical  matter,  it  would 
require  the  skills  of  extremely  gifted  and  imaginative 
technicians. 

(4) Problems of authority and responsibility

Of  course,  formula  selection  is  not  purely  a  technical 
problem, not even primarily a technical problem. Formulas
are used to handle a basic political issue, i.e., how will the
money  be  divided?  By  turning  debates  over  jurisdictions  into 
debates  over  parameters,  formula  conflicts  in  some  ways  cool 
the  heat  of  debates  over  the  spoils  of  politics,  but  in  some 
ways  they  merely  deflect  the  heat.  To  change  the  metaphor, 
formula debates are less nakedly political than debates over
the  location  of  military  bases,  but  this  is  not  to  say  that  the 
emperor  really  has  clothes  on.  For  all  the  talk  about  the 
proper  measurement  of  "Need,"  the  bottom  line  is  often  the 
allocation to region, State, and district, and the major impact
of  formulas  may  be  to  give  extra  power  to  the  coalitions  that 
can  best  maneuver  the  simulations  and  printouts  as  the 
formula  is  debated. 

This  jaundiced  opinion  is  not  in  any  sense  a  critique  of 
Congress.  As  I  think  I  have  made  clear,  it  is  my  opinion  that 
in  the  absence  of  better  theory,  better  data,  and  better 
knowledge  about  how  grant-in-aid  funds  are  used,  it  is 
appropriate  that  allocation  procedures  in  a  world  of  formulas 
be  as  freewheeling  and  political  as  allocation  procedures  in  a 
world  without  formulas.  And,  of  course,  the  political  arena 
is  the  appropriate  place  for  the  setting  of  policy. 

This  is  what  my  colleagues  may  have  meant  when  they 
stated  in  the  first  paragraph  of  the  Introduction  to  chapter  I 
of  the  subcommittee  report: 

"For  the  purpose  of  this  study,  it  is  assumed  that 
whatever  Congress  specifies  in  the  authorizing  legislation 
for  a  grant-in-aid  program  on  the  manner  of  allocation  of 
Federal  funds  is  in  principle  an  equitable  distribution, 
although  anomalous  and  unanticipated  results  may  emerge 
in  some  instances." 

I  interpret  this  statement  to  mean  that  it  is  the  role  of 
Congress  to  design  formulas,  and  the  role  of  the  statistical 
community  to  identify  and  discourage  practices  leading  to 
anomalous  and  unanticipated  results,  to  set,  as  it  were,  some 
rules of chivalry to govern the trial by battle. Everyone who
works in this area has some horror story: even though we
know that Congress has the right to do anything, why on
earth did they decide to do that? Just making up a list of the
"that's" would be a useful undertaking; the subcommittee
contributed a few, and it recommended that program
designers be informed of undesirable formula practices.

In  a  more  positive  vein,  there  is  evidence  that  Congress 
is  interested  in  improving  the  quality  of  its  formulas  and  of 
the  underlying  data.  Two  good  examples  are  the  funding  of 
the study of the measure of poverty, and the plans for the
1985 census. Clearly, whatever its antecedents, formula
allocation is developing into a process whose legitimacy
depends  on  some  notion  of  fairness.  Considerable  thought 
needs  to  be  given  to  the  precise  division  of  responsibility  in 
this  area.  Despite  the  natural  timidity  of  statisticians  on 
matters  of  funding  policy,  statistical  policy  and  funding 
policy  are  intertwined,  and  some  way  must  be  found  to 
reconcile  the  expertise  and  concerns  of  the  statisticians  with 
the  policy  responsibility  of  the  program  drafters  in  the 
legislative and executive branches. It is because of the
importance, and the difficulty, of working out an appropriate
modus operandi, that the subcommittee's first
recommendations  were  procedural.  We  now  turn  to  the 
recommendations. 

Recommendations  of  the  Subcommittee 

The  subcommittee  made  nine  recommendations,  which  I  will 
present or describe in this section. A number of them follow
directly from considerations raised earlier in this paper; several
require a bit more explanation.

The  first  recommendation  refers  to  the  specification  of  pro- 
gram goals,  and  of  funding  procedures: 

"That  program  goals  be  specified  as  clearly  and  com- 
pletely as  possible  in  the  statement  of  purpose  of  each 
grant-in-aid  act  and  that  program  drafters  guard  against 
overspecification of statistical data and procedures."

As  noted,  there  is  a  natural  tendency  for  Congress  to  express 
its  goals  in  general  terms,  and  then  to  describe  the  formula 
precisely,  even  so  far  as  to  prescribe  particular  data  series  or  the 
manner of collection of administrative data. The specification
of particular procedures may prevent the development of better
ones, and the lack of goal specification may leave administrators
without effective guidelines when decisions must be made. The
appropriate level of specificity of goals and procedures is by no
means a simple matter; in practice, it should probably be
worked out by statisticians and program designers working
together.

The  second  recommendation  is: 

"That  provision  be  made  for  an  active,  continuous 
interface  between  legislative  program  drafters  and  the 
statistical  community.  This  interface  is  perceived  as 
necessary  to  the  execution  of  the  previous  recommenda- 
tion. Many  program  drafters  regularly  consult  with 
statistical  staffs  about  the  feasibility,  the  likely  effects, 
and  the  possible  difficulties  of  various  formula  pro- 
cedures. What  is  recommended  is  a  standard  operating 
procedure." 

The  third  recommendation  refers  to  the  testing  and  moni- 
toring of  formulas: 

"That  statistical  and  program  agencies  provide  to  program 
drafters an analysis of the sensitivity over time of proposed
formulas  and  of  the  statistics  they  incorporate  so  that 
possible  effects  on  allocations  can  be  anticipated.  Also, 
that  provisions  be  made  for  testing,  monitoring,  and 
assessing  by  program  agencies  of  the  performance  of  each 
specific  formula  or  allocation  rule  prior  to  enactment." 

In  other  words,  before  Congress  mandates  a  formula,  it 
should  have  available  a  careful  analysis  of  its  likely  impact,  both 
in  the  current  year  and  in  the  future.  The  value  of  such  an 
analysis  is  fairly  obvious,  but  it  is  also  clear  that  legislative 
negotiation  has  led  to  a  number  of  formulas  whose  properties 
were only barely understood at the time. Because of the natural
reluctance to change a funding rule once in place, it is
particularly  important  that  difficulties  with  existing  formulas  be 
identified  as  rapidly  as  possible.  Note  that  goal  statements  (see 
recommendation  1)  are  essential  to  a  diagnosis  that  a  formula  is 
misbehaving. 
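One simple form such a sensitivity analysis might take is to run the proposed formula over several years of data and report how far each area's share moves from one year to the next. The sketch below assumes a purely proportional formula on a single data element; the areas, years, and counts are hypothetical.

```python
def shares(values):
    """Return each area's share under a purely proportional formula."""
    total = sum(values.values())
    return {area: v / total for area, v in values.items()}


def max_yearly_swing(series_by_year):
    """Largest year-to-year change in each area's share, in percentage points."""
    years = sorted(series_by_year)
    swings = {}
    for prev, cur in zip(years, years[1:]):
        before, after = shares(series_by_year[prev]), shares(series_by_year[cur])
        for area in after:
            change = abs(after[area] - before[area]) * 100
            swings[area] = max(swings.get(area, 0.0), change)
    return swings


# Hypothetical counts of the formula's data element, by year and area.
counts = {
    1975: {"A": 100, "B": 100},
    1976: {"A": 150, "B": 100},
    1977: {"A": 100, "B": 100},
}
swings = max_yearly_swing(counts)
for area, points in swings.items():
    print(f"{area}: largest swing {points:.1f} percentage points")
```

A table of this kind, prepared before enactment, would show drafters which areas are exposed to large fluctuations and whether bounds on year-to-year change are worth considering.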

The  fourth  recommendation  is: 

"That  legislative  drafters  and  program  designers  be  advised 
of  data  problems  and  the  existence  of  statistical  pro- 
cedures, as  exemplified  in  the  five  case  studies,  which  may 
lead  to  formulas  with  consequences  that  are  generally 
recognized  as  undesirable." 

One  practice  mentioned  is  the  freedom  in  CETA  given  to 
areas  of  substantial  unemployment  (ASU)  to  define  their  own 
boundaries.  This  appears  to  lead  to  gerrymandering  of  ASU's.  It 
certainly creates substantial problems in the generation of data for
the ASU's. Another practice, which received an entire
recommendation, is the use of eligibility cutoffs (see the ninth
recommendation).

The  fifth  recommendation  is: 

"To  initiate  a  limited  program  in  research  and  develop- 
ment in  formula  design,  to  deal  with  many  of  the  issues 
raised  in  the  report." 

The  sixth  recommendation  is: 

"The designation of a limited number of additional
official statistical series for use in funds allocation.⁴ These
official series could be of higher quality than other series,
and their designation would discourage attempts to make
the choice among similar series a political decision.
Presumably the official series could be adjusted for
formula use if desired."

The  seventh  recommendation  refers  to  data  comparability. 

"One  of  the  subcommittee  findings  was  that  choice  of 
data  for  small  areas  may  be  unreasonably  limited  if  it  is 
required  that  only  nationally  consistent  data  series  be 
used.  Better  allocations  would  result  in  some  cases  if  good 
State  data  were  used  for  allocations  within  that  State,  or 
good  county  data  within  that  county.  The  recom- 
mendation is  for  policy  flexibility  in  substate  allocation, 
subject  to  specific  Federal  statistical  and  administrative 
guidelines."

The  eighth  recommendation  is: 

"That  since  data  errors  are  inevitable  and  since  statistical 
resources  are  necessarily  limited,  priority  be  given  to 
minimizing  the  very  large  errors  which  may  occur  in  data 
used  for  the  allocation  of  funds." 

This refers in particular to errors for large areas; even small
relative errors for large areas may have a large impact on the
distribution of funds.
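The point about large areas can be seen in a short calculation. The sketch assumes a fixed appropriation divided in proportion to a count; the three areas, their counts, and the 2 percent error are hypothetical.

```python
def allocate(counts, total):
    """Divide a fixed appropriation in proportion to each area's count."""
    s = sum(counts.values())
    return {area: total * c / s for area, c in counts.items()}


def dollars_shifted(true_counts, area, relative_error, total):
    """Dollars gained by `area` when its count alone is overstated."""
    observed = dict(true_counts)
    observed[area] = true_counts[area] * (1 + relative_error)
    return allocate(observed, total)[area] - allocate(true_counts, total)[area]


# Hypothetical true counts and appropriation (10 dollars per person counted).
true_counts = {"Large": 10_000_000, "Medium": 1_000_000, "Small": 100_000}
TOTAL = 111_000_000

shift_large = dollars_shifted(true_counts, "Large", 0.02, TOTAL)
shift_small = dollars_shifted(true_counts, "Small", 0.02, TOTAL)
print(f"2 percent overcount in Large shifts about {shift_large:,.0f} dollars")
print(f"2 percent overcount in Small shifts about {shift_small:,.0f} dollars")
```

The same relative error moves far more money when it occurs in the large area, and because the appropriation is fixed, that money comes out of every other area's allocation.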


⁴ As the report indicates, Circular A-46, "Standards and Guidelines for
Federal Statistics," designates official series for total population, labor
force, unemployment, and poverty.
The ninth recommendation is:

"A  specific  suggestion  for  the  problem  of  eligibility 
cutoffs— that  data  errors  in  the  vicinity  of  the  cutoff  level 
of  the  criterion  variable  may  have  very  substantial  effects 
on  the  funds  allocated  to  the  affected  area.  The 
recommendation  is  for  a  gradual  transition  from  receiving 
no  allocation  to  receiving  the  full  formula  amount." 
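The difference between a sharp cutoff and a gradual transition can be sketched as follows. The criterion variable (an unemployment rate), the cutoff of 6.5 percent, the transition band from 6.0 to 7.0 percent, and the dollar amount are all hypothetical; the linear ramp is one possible transition, not a form specified by the subcommittee.

```python
def hard_cutoff(rate, full_amount, cutoff):
    """All-or-nothing eligibility at a single cutoff value."""
    return full_amount if rate >= cutoff else 0.0


def phased_in(rate, full_amount, start, full):
    """Linear transition from no allocation at `start` to the full amount at `full`."""
    if rate <= start:
        return 0.0
    if rate >= full:
        return full_amount
    return full_amount * (rate - start) / (full - start)


FULL_AMOUNT = 1_000_000
true_rate, measured_rate = 6.6, 6.4  # a small measurement error near the cutoff

loss_cutoff = (hard_cutoff(true_rate, FULL_AMOUNT, 6.5)
               - hard_cutoff(measured_rate, FULL_AMOUNT, 6.5))
loss_phased = (phased_in(true_rate, FULL_AMOUNT, 6.0, 7.0)
               - phased_in(measured_rate, FULL_AMOUNT, 6.0, 7.0))
print(f"loss from the error under a hard cutoff: {loss_cutoff:,.0f}")
print(f"loss from the error under a phased transition: {loss_phased:,.0f}")
```

Under the hard cutoff the measurement error costs the area its entire allocation; under the phased transition the same error costs only the fraction of the full amount that corresponds to the size of the error within the band.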


If  the  recommendations  of  the  subcommittee  are  carried  out, 
it  is  clear  that  many  new  procedures  will  be  developed  in  the 
area  of  formula  design.  These  procedures  should  help  to  avoid  a 
number  of  crises  that  regularly  surface  in  the  world  of  formula 
allocation.  I  also  look  forward  to  the  implementation  of  the 
research  agenda,  which  should  produce  results  of  great  practical 
and  theoretical  interest. 


Personal Income: Some Observations
on its Construction, Uses and Adequacy
as a Subnational Income Measure

Edwin  J.  Coleman 

Bureau  of  Economic  Analysis 

The purpose of this paper is to outline the method of
construction  of  the  regional  economic  measures  produced  by 
the  Bureau  of  Economic  Analysis  (BEA).  There  will  be  a 
discussion  on  the  uses  of  the  estimates  within  BEA  and  also 
comments  on  some  of  the  uses  of  BEA's  regional  income  and 
employment estimates by other users, both private and governmental.
There will also be comments on the sources of materials
used to construct the BEA estimates, and a description of one of
the more important administrative record sources used in
producing the personal income estimates. In closing, an effort
will  be  made  to  touch  on  the  problem  of  measuring  errors  in 
local  area  income  and  employment  estimates. 

Personal  Income  Defined 

Personal  income  as  defined  by  BEA  is  the  current  income 
received  by  the  residents  of  an  area  from  all  sources.  It  is 
measured  after  deduction  of  personal  contributions  to  Social 
Security,  government  retirement  plans,  and  other  social  in- 
surance programs,  but  before  deduction  of  income  and  other 
personal  taxes.  It  includes  income  received  from  business, 
Federal,  State,  and  local  governments,  households,  institutions, 
and foreign governments. It consists of wages and salaries
(including executive salaries, bonuses, commissions, payments in
kind, incentive payments, and tips); supplementary earnings
termed "other labor income" (predominantly employer contributions
to private pension, health, and welfare funds); the net
incomes of owners of unincorporated businesses (farm and
nonfarm, with the latter including the incomes of independent
professionals); net rental income; royalties; dividends; interest;
and government and business transfer payments (consisting, in
general, of disbursements to persons for which no services are
rendered, such as unemployment benefits, Social Security
payments, medicare benefits, retirement pay of governmental
programs, welfare, and relief payments).
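The definition amounts to a sum of components less personal contributions for social insurance. The sketch below follows the component list given above; the dollar figures (in millions) are hypothetical, and the dictionary keys are illustrative labels rather than BEA series names.

```python
def personal_income(components, personal_contributions):
    """Sum the income components, then deduct personal contributions
    for social insurance (income and other personal taxes are not deducted)."""
    return sum(components.values()) - personal_contributions


# Hypothetical area, millions of dollars.
components = {
    "wages_and_salaries": 700.0,
    "other_labor_income": 50.0,
    "proprietors_income": 90.0,   # farm and nonfarm unincorporated business
    "rent_and_royalties": 20.0,
    "dividends": 30.0,
    "interest": 60.0,
    "transfer_payments": 150.0,
}
total_income = personal_income(components, personal_contributions=45.0)
print(f"personal income: {total_income:,.1f} million dollars")
```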

Residence  Versus  Place  of  Work 

For  the  measurement  of  personal  income  on  a  regional  basis, 
BEA  assigns  the  income  received  to  the  State,  county,  or  SMSA 
in  which  the  individual  resides.  However,  BEA  also  presents 
labor  and  entrepreneurial  income  in  industrial  detail  by  place  of 
work, since the bulk of labor¹ and proprietors' income is
reported by industry at the point of disbursement or establishment
location. The income is then adjusted to a place-of-residence
basis. In the past, this adjustment has been made at
the all-industry level. However, as a part of the benchmark
revisions currently in progress, the residence adjustment is now
being made separately for broad industrial sectors. Thus, the
differing rates of growth of the industrial sectors will be more
precisely reflected in the residence adjustment calculations.

¹ Labor income is the sum of wages and salaries plus other labor
income.

Regional measures of labor and proprietors' income
(sometimes referred to as earnings) are important on both a
place-of-work  and  a  place-of-residence  basis.  The  estimates  on 
the  place-of-work  or  where-earned  basis,  which  include  detail  by 
industry,  are  useful  in  the  analysis  of  the  industrial  structure  of 
a  given  area.  The  estimates  when  aggregated  and  converted  to  a 
place-of-residence  or  where-received  basis,  are  useful  for  the 
analysis  of  consumer  markets  and  purchasing  power. 

Estimates for the other components of personal income
(dividends, personal interest income, rental income of persons,
and transfer payments) are made on a where-received basis only.
In the case of transfer payments, there is no economic relevance
to the where-earned concept, and dividends, interest, and rent
are not estimated on a where-earned basis because of the lack of
data suitable for assigning such income to the areas in which it is
generated.

Personal contributions for social insurance, which are not
included in personal income but are explicitly deducted from
total labor and proprietors' income, are also subject to residence
adjustment. Since contributions to most social insurance
programs are withheld at the point of disbursement, they are
estimated, like wages and salaries, on a place-of-work basis.
Accordingly, most personal contributions are adjusted to
correspond to the place-of-residence concept of the personal
income estimates. Several contribution items are obtained by
place of residence (the most important of which are
contributions of the self-employed and premium payments for
government life insurance) and, therefore, do not require
adjustment.

Scope  of  Regional  Measurement  Program 

The  regional  income  and  employment  statistics  published  by 
BEA's  Regional  Economic  Measurement  Division  offer  a  long 
term  quantitative  description  of  economic  developments  in  each 
State,  SMSA,  and  county  that  can  be  related  directly  to  those 
on  the  national  or  regional  levels.  Currently,  BEA  is  producing 
annual personal and money income estimates² for each State,
SMSA,  and  county  as  well  as  for  the  parishes  of  Louisiana, 
independent  cities  of  Virginia,  and  census  divisions  of  Alaska.  A 
detailed  personal  income  series  by  States  is  published  each  year 
in  the  August  issue  of  the  Survey  of  Current  Business  (SCB). 
Annual  estimates  are  available  for  the  years  1929  to  1976. 
Revised estimates of State personal income for the years
1971-76 will be published in the August 1977 SCB. Revised
State estimates for 1958-70 will be presented later this year.³ In
addition to the annual State estimates of personal income
published in the August SCB, the Bureau of Economic Analysis
publishes quarterly estimates of State personal income in the
January, April, July, and October issues. Quarterly State
estimates are available from first quarter 1969 to first quarter
1977.

²The State and county money income estimates are obtained by
modifying personal income and are produced by BEA for use by the
Census Bureau in preparing the statistics for the general revenue sharing
formulas.

County  personal  income  data  are  available  for  the  years 
1929,  1940,  1950,  1959,  1962,  and  for  1965-75.  Estimates  of 
per  capita  personal  income  by  State,  SMSA,  and  county  have 
been  prepared  for  the  same  years  as  the  aggregate  personal 
income measures. Breakdowns of personal income by type of
payment, and of labor and proprietors' income (earnings) by broad
industrial sector, are available for each SMSA and county.

In addition, BEA makes available at the sub-State level a
breakdown of full- and part-time employment by type and
major industry group; a detailed breakdown of transfer
payments by major category; and a tabulation of farm proprietors'
income and expenses. This latter component of personal
income continues to be one of the most difficult income flows
to measure because of its volatility and a lack of annual
sub-State data. Farm income is the net income of farm
proprietors  and  is  equal  to  (and  derived  statistically  as)  the  gross 
income  of  all  farm  operators  minus  production  expenses  and 
adjusted  to  exclude  the  income  of  corporate  farms.  The 
concepts  underlying  the  BEA  county  estimates  of  farm  income 
are  the  same  as  those  used  for  the  national  and  State  farm 
income estimates prepared by the U.S. Department of
Agriculture. The major definitional or classificational difference
between  the  two  farm  series  is  that  the  USDA  totals  include, 
and  the  BEA  figures  exclude,  income  of  corporate  farms. 

Construction of the Personal Income Estimates

Currently,  almost  400  series  of  separate  estimates  go  into 
the derivation of the line items shown in the personal
income tables for States, SMSA's, and counties. This detailed
estimating approach provides the information needed to audit
both the sources and methods of estimation, and assures that
the final estimates will accurately reflect industry and
component differentials at both the State and county levels. For
example,  such  diagnostic  capability  is  especially  critical  when 
considering  the  use  of  an  income  series  for  formula  allocations 
since  one  must  select  a  measure  that  can  be  defended 
empirically  as  well  as  conceptually. 

The bulk of the source materials used to prepare the
estimates is culled from the administrative records of Federal
and State government programs, with the remainder of the data
coming from the various censuses and nongovernment sources.
Several of the more important sources of administrative record
information include data generated as the byproduct of the
State unemployment insurance (UI) programs of the
Employment and Training Administration, the insurance programs of
the Social Security Administration, and the Federal tax
program of the Treasury Department. Two of the more
important censuses utilized are the censuses of agriculture and
population. The data obtained from the above sources yield
more than 90 percent of the data needed for the preparation of
State and county income estimates.

³The definitional and classificational revisions incorporated in the
State estimates are those made in the 1976 benchmark revision of the
national income and product account (NIPA) estimates. The statistical
revisions are traceable not only to the rebenchmarking of the NIPA
estimates but to new data sources and improved methodology. For a
discussion of the revisions, see the August 1977 issue of the SCB.

The  local  area  income  estimates  prepared  by  the  Regional 
Economic  Measurement  Division  are  tied  directly  to  BEA's 
official estimates of personal income by States.⁴ That is, the
State  total  for  each  income  component,  as  taken  from  the 
official  State  income  series  before  adjustment  for  residence,  is 
allocated  to  the  counties  of  the  State  in  accordance  with  each 
county's  proportionate  share  of  the  same  or  some  related  series 
that  is  available  on  a  county  basis.  For  some  income  items,  more 
information  is  available  for  SMSA's  than  for  individual  counties. 
In  these  instances,  State  totals  are  first  allocated  to  the 
individual  SMSA's  of  each  State  on  the  basis  of  a  particular 
allocator  and  the  remainder  assigned  to  all  non-SMSA  counties 
by  the  most  appropriate  allocator  available.  Prior  to  the  release 
of  the  1973  estimates,  it  was  not  possible  to  provide  county 
breakdowns  of  multicounty  SMSA's  due  to  a  lack  of  sufficient 
information.  However,  the  recent  availability  of  additional 
Census  Bureau  commuting  flows  data  and  Internal  Revenue 
Service  (IRS)  county  tabulations  of  selected  income  items  has 
enabled  BEA  to  make  estimates  of  personal  income  for  all 
counties. 
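
The allocation step described above can be sketched in a few lines of Python. This is only an illustration of proportional allocation under stated assumptions: the function name, the county labels, and the figures are hypothetical, and the sketch ignores the SMSA-first variant and the residence adjustment.

```python
def allocate_state_total(state_total, county_allocator):
    """Distribute a State control total across counties in proportion
    to each county's share of a related (allocator) series."""
    allocator_sum = sum(county_allocator.values())
    if allocator_sum <= 0:
        raise ValueError("allocator series must have a positive total")
    return {
        county: state_total * value / allocator_sum
        for county, value in county_allocator.items()
    }


# Hypothetical State total for one income component (millions of dollars)
# and a hypothetical related county series used as the allocator.
state_wages = 1200.0
county_payrolls = {"County A": 300.0, "County B": 100.0, "County C": 100.0}

county_estimates = allocate_state_total(state_wages, county_payrolls)
print(county_estimates)
```

Because each county receives a share of the State total, the county estimates add back to the State figure by construction, whatever allocator series is chosen.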

Because  of  the  availability  of  relatively  more  accurate  State 
control  totals,  a  different  allocator  may  be  used  in  each  State 
without  impairing  the  interstate  comparability  of  the  estimates. 
This  use  of  individual  State  totals  makes  it  possible  to 
incorporate  the  best  county  estimating  series  in  each  State  thus 
increasing  the  accuracy  of  the  sub-State  estimates.  Because  most 
components  of  personal  income  can  be  estimated  more  reliably 
for  States  than  for  smaller  geographic  areas,  allocation  permits 
the  use  of  a  wide  range  of  related  series  of  data  which,  while 
approximating,  may  not  precisely  "match"  the  basic  State  series 
to  be  allocated. 

The  major  alternative  to  BEA's  present  approach  to  income 
measurement  would  be  to  collect  the  necessary  information  in 
surveys  of  income  recipients.  Ideally,  the  survey  approach 
would  provide  data  directly  suited  for  the  measurement  of 
personal  income,  eliminating  the  necessity  of  adjusting  for 
definitional  and  conceptual  differences  among  the  various 
inputs.  Unfortunately,  the  cost  associated  with  this  alternative 
would  be  prohibitive  because  of  the  sample  size  necessary  to 
permit reliable local area estimates. In addition, it is well known
that response errors are high, particularly for property and
transfer income components. The use of administrative records
is both reliable and economical insofar as the data are
subject to internal review by the agency administering the
program. Also, the costs of using data collected by other
agencies for administrative purposes are minimal compared with
those required for regional surveys. A brief examination of the
ES-202 report (a report required for establishments covered by
the UI program), one of the more important administrative
record files used by BEA, will illustrate the value and
importance of these records as a source of data for local area
income measurement.

⁴A supplement to the Survey of Current Business entitled Personal
Income by States Since 1929 contains definitions, sources of data, and
methods of estimation used to construct the State personal income series.
Although out of print, copies of this volume are available at the libraries
and Bureau of Business Research in most colleges and universities
throughout the country. An updated State methodology is being
developed as part of the Regional Economic Measurement Division's
benchmark revisions.

The ES-202 unemployment insurance (UI) tax reports of the
Nation's  employers  to  their  respective  State  Employment 
Security  Agencies  are  used  to  estimate  four-fifths  of  total  wage 
and  salary  disbursements,  which  account  for  about  two-thirds  of 
all  personal  income.  The  tapes  and  tabulations  of  quarterly 
wages  and  monthly  employment  which  BEA  obtains  from  each 
State  Employment  Security  Agency  through  the  Bureau  of 
Labor  Statistics  (BLS)  are  summarized  by  industry  and  county. 
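
The two proportions quoted above can be combined to gauge how much of total personal income rests on the ES-202 reports. The short Python sketch below does the arithmetic with exact fractions. Both inputs are the approximate shares stated in the text, so the result (a little over half) is only indicative, and the variable names are illustrative.

```python
from fractions import Fraction

# Approximate shares as stated in the text.
wage_share_of_personal_income = Fraction(2, 3)  # "about two-thirds"
es202_share_of_wages = Fraction(4, 5)           # "four-fifths"

# Implied share of all personal income estimated from ES-202 reports.
es202_share_of_personal_income = (
    es202_share_of_wages * wage_share_of_personal_income
)
print(es202_share_of_personal_income, float(es202_share_of_personal_income))
```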

The  usefulness  of  the  ES-202  file  has  been  somewhat  limited 
in  the  past  because  of  its  exclusion  of  "small  firms."  Until  1972, 
more than half of the States excluded establishments with fewer
than four employees from mandatory coverage. Beginning with
the first quarter of 1972, however, the UI coverage of payrolls
and  employment  was  extended  to  cover  most  firms  in  all  but  a 
few  industries.  The  small  firm  exclusions  were  eliminated 
entirely  with  the  sole  exception  of  nonprofit  organizations,  for 
which  mandatory  coverage  extends  only  to  those  with  four  or 
more  employees.  Other  than  this,  firms  of  all  sizes  are  now 
included  under  the  provisions  of  the  Federal  Unemployment 
Tax  Act.  It  should  be  noted  that  the  ES-202  employer  reports 
are  not  fully  comprehensive.  Payrolls  of  several  industries  still 
remain  outside  the  scope  of  the  State  unemployment  insurance 
laws.  These  excluded  elements  consist  of  wages  and  salaries  of 
the  Federal  Reserve  Board,  national  banks,  State  banks  that  are 
members  of  the  Federal  Reserve  System  in  New  Jersey  (prior  to 
1972),  electric  railways,  carrier  affiliates  in  the  transportation 
industry,  insurance  solicitors  working  on  commission,  and 
employees'  tips.  In  some  instances,  payrolls  of  these  industrial 
segments  can  be  estimated  quite  readily  by  county.  In  others, 
the  task  is  difficult  and  the  results  less  satisfactory. 

The  1972  increase  in  unemployment  insurance  coverage  has 
substantially  enriched  this  basic  data  source,  and  the  1978 
coverage  change  will  further  enhance  its  value.  At  that  time,  97 
percent  of  all  wage  and  salary  workers  will  be  covered.  Still 
excluded  will  be  the  religious  sector,  domestic  services  (except 
for  those  with  wages  of  $1,000  per  quarter),  agents  on 
commission,  casual  labor,  student  nurses  and  interns,  students 
working  for  schools,  domestic  service  in  colleges,  and  clubs  or 
fraternities. Nonprofit institutions with fewer than four
employees will be excluded, although they may elect to be covered
on  a  voluntary  basis.  In  the  farm  sector,  farms  with  ten 
employees  or  more  or  with  wages  of  $20,000  in  a  calendar 
quarter  will  be  covered.  Probably  the  most  significant  addition 
under  the  1978  coverage  changes  will  be  the  inclusion  of  all 
State and local government employees. Today, the ES-202-based
wage estimates are more complete and reliable than those
from any other source, with the possible exception of those
transfer payments based on the records of government
disbursements to individuals. In addition, because of their sizeable
weight in the total income flows, the wage estimates based on
the employer reports impart a large measure of reliability to the
estimates of aggregate income.

Uses  of  Personal  Income 

In addition to their use in fund allocation formulas, as noted
earlier, the regional personal income estimates serve a wide
range of other uses.

The Bureau depends heavily on the trends reflected in the
county wage and employment estimates in its preparation of
long-term subnational projections of population, income, and
employment. The BEA regional projections reflect historical
growth trajectories which, with the specification of the
appropriate coefficients, may be taken to represent (1) the demand
for additional public and private investment, and (2) the best
estimates of the distribution of population
and industry activity in the absence of programs designed to
alter the growth trajectories in line with other more desired
patterns. Subarea projections are maintained for 173 economic
areas as well as for a wide range of functional county
aggregations. These economic area projections have also been
disaggregated and reassembled for States and for each of 261
metropolitan areas as well as for the residual non-SMSA
portions of each of the 173 areas.

The  county  wages  and  employment  estimates  are  also  an 
essential  input  to  the  system  which  the  Bureau  of  Economic 
Analysis  is  developing  for  measuring  the  impact  and  providing 
for  the  evaluation  of  public  programs.  This  impact  evaluation 
system  traces  the  economic  population  redistributions  over 
regions,  and  measures  the  potential  for  utilizing  otherwise 
unemployed  labor.  It  will  include  a  regionally  specific  industry 
displacement  model  that  traces  the  redistributions  of  industrial 
activity  attributable  to  a  public  or  large  scale  private  project;  it 
will  also  provide  an  interindustry  multiplier  model  that  assesses 
the  secondary  effects  of  the  project  on  the  suppliers  of 
intermediate  goods  and  services  to  the  affected  industries,  and 
on the suppliers of consumer goods and services to the affected
regional  populations.  Industry  location  threshold  models  will 
identify  any  further  induced  redistributions  of  economic 
activity  stemming  from  changes  in  the  sizes  and  industrial 
compositions  of  areas;  a  demographic  model  will  balance 
changes  in  labor  force  and  population  with  employment 
opportunities  via  migration;  and,  in  addition,  a  labor  utilization 
model will compare differences between labor market
conditions in all affected areas with and without the proposed
project  and  estimate  the  effects  on  national  and  regional  net 
income  and  employment. 

Federal,  State,  and  local  governments  use  the  local  area 
personal  income  statistics  to  analyze  economic  conditions  in 
various  areas,  serve  as  a  basis  for  allocating  funds,  monitor  the 
effectiveness  of  government  programs,  estimate  in  advance  of 
adoption the regional effects of alternative development
programs, measure the capacity of local areas to provide tax
revenues, and gauge differences in welfare. The National School
Lunch Act and the General Revenue Sharing Act are only two
examples of government programs that use BEA's estimates of
personal income to allocate funds. Business uses BEA's
estimates as a direct measure of consumer markets by geographic
area and as an indirect measure of regional industrial markets.
Universities  and  research  organizations  use  these  same  estimates 
to  identify  and  measure  the  factors  responsible  for  area 
differences  in  levels  of  income  and  rates  of  economic  growth. 

Adequacy of Personal Income as a Subnational Income Measure

Since adequacy is a relative term, it is assumed for the
purposes of this paper that the concept of personal income as
defined on pages 1 and 2 is an adequate measure of income
received by individuals. The following comment on the
adequacy of personal income will, therefore, confine itself to its
statistical adequacy in terms of its currency, content, and
accuracy.

In the area of currency, BEA's subnational personal income
estimates are available from 4 months to 15 months after the
close of the reference quarter or year.

Quarterly State personal income estimates are available 4
months after the close of the reference quarter. Preliminary
annual State personal income estimates are also available 4
months after the close of the calendar year, followed by the
release of revised and more detailed State estimates 8 months
after the close of the calendar year in question. A 4-month lag
in the release of State quarterly personal income figures yields a
product that is timely enough for most purposes, and the
8-month lag in the release of the more detailed annual State
personal income estimates causes few hardships for the users of
this series.

The much finer geographic breakdowns of personal income
(SMSA's and counties) are available 15 months after the close
of the calendar year. The 15-month lag required to produce the
SMSA and county personal income estimates is considered to
be too long and has limited the usefulness of the subnational
personal income estimates both in terms of current program
application and regional analysis.

The problem of the currency of the subnational estimates of
personal income, fortunately, is not an intractable one. The
solution is to develop quarterly SMSA and county estimates of
personal income similar to those currently produced at the State
level.

Another important test of the adequacy of the subnational
personal income series is the content, or amount and quality, of
information generated by the series. As noted earlier, in the case
of SMSA and county personal income, the amount and type of
underlying detail produced by the estimating process is
sufficient to serve a very broad range of applications. At the same
time, this extensive array of information allows for more
detailed cross-checks and edits.

The criteria for judging the statistical adequacy of these
regional measures are reliability and accuracy. The sharp rise of
interest in the reliability of regional or subnational
macroeconomic data stems from the dramatic expansion in the use of
local  area  data  for  programmatic  purposes.  Because  these  broad 
economic  or  demographic  measures  are  finding  their  way  into 
formulas  used  to  distribute  billions  of  dollars  of  Federal  and 
State  funds  to  the  nation's  political  subdivisions,  attention  is 
being  focused  with  increasing  frequency  on  the  quality  of  such 
local  area  estimates  as  personal  income,  employment,  and 
population.  The  problem  of  judging  the  reliability  of  such  series 
is  compounded  because  the  use  of  comparative  analysis  is  limited, 
since  the  series  in  question  is  the  sole  source  of  such 
information.  The  method  of  evaluation  must,  therefore,  rest  on 
an  examination  of  the  sources  and  methods  used  in  compiling  or 
estimating  the  statistics.  If  the  estimates  are  a  product  of  a 
sample  survey,  one  judges  its  reliability  on  the  basis  of  its 
conformity  to  current  sampling  theory  and  practice  and  the 
implications  of  nonsampling  errors.  In  the  case  of  BEA's 
personal  income,  the  local  area  statistics  are  primarily  the 
byproduct  of  information  generated  from  administrative 
records.  BEA  supplements  these  basic  statistics,  which  may  be 
presumed  to  be  "reliable,"  with  data  of  lesser  quality,  scope, 
and  relevance.  In  order  to  adjust  for  the  gaps  and  deficiencies  in 
the  poorer  quality  data,  indirect  procedures  and  arbitrary 
assumptions  are  sometimes  required.  The  use  of  such  indirect 
procedures  has  varied  considerably  over  the  past  half  century. 
Moreover,  the  impact  of  these  procedures  on  the  individual 
State  and  county  estimates  varies  in  accordance  with  regional 
differences  in  industrial  structure.  Furthermore,  adjustments  are 
continually  made  for  changes  in  reporting  procedures  such  as 
those  resulting  from  changes  in  industry  and  geographic  codes, 
coverage,  SIC  code  conversions,  coding  errors,  shifts  in  Federal 
and  State  processing,  and  program  priorities. 

To  provide  measures  of  the  error  introduced  into  the  income 
estimates  resulting  from  the  use  of  administrative  records  and 
survey  material,  or  of  errors  arising  from  the  methodology  or 
errors  caused  by  the  use  of  indirect  procedures  or  the  normal 
"statistical  housekeeping  chores"  would  be  extremely  difficult. 

The adequacy of the BEA subnational income series is best
evaluated by a review of the methodology and a determination
as to whether or not all the relevant information collected by
another agency for another purpose has been obtained and
correctly used by BEA in estimating the elements of its income
series.⁵

Any local area measurement effort has limitations and
problems, and the BEA local area personal income program is no
exception. However, most of these problems are held to
manageable levels due, in large part, to the existence of the
various Federal/State cooperative statistical systems generating
local area data and by the close cooperation extended to BEA
by a large array of statistical and program agencies at both the
national and State levels. This cooperation produces the most
comprehensive annual measure of income available at the
subnational level, providing a record of economic developments
for States, counties, and SMSA's that spans a half century. It has
produced a personal income series that is consistent both
geographically and industrially. Perhaps, it has enabled the
development of an economic series which, because of its
extensive underlying detail, has a diagnostic capacity. This
capacity can be used to support and explain the annual changes
in the geographic distribution of income among the residents of
the Nation's major political subdivisions.

⁵The reader can draw his own conclusions as to the adequacy of
BEA's efforts from a review of the sources and methods used to prepare
the estimates. This methodology is published in a special supplement to
the Survey of Current Business entitled "Local Area Personal Income
1969-1974." This supplement can be purchased from the National
Technical Information Service, 5285 Port Royal Road, Springfield,
Virginia 22161. The methodology is published in Volume 1, Accession
Number PI325'1055. Price: $10 paperback, $2.25 microfiche.
Linking CPS Unemployment Estimates
With  Administrative  Records  On 
Unemployment 

Martin  Ziegler 

Bureau  of  Labor  Statistics 

INTRODUCTION 

About  16  billion  dollars  will  be  spent  in  fiscal  year  (FY) 
1977  on  various  Federal  assistance  programs  designed  to  help 
States  and  local  communities.  The  basis  for  these  allocations 
will  be  the  local  area  unemployment  estimates  developed  by 
State  Employment  Security  Agencies  under  the  auspices  of  the 
Bureau  of  Labor  Statistics.  Unfortunately,  the  quality  of  the 
estimates  for  many  of  the  areas  is  poor.  In  an  effort  to  improve 
the  estimates  the  Bureau  of  Labor  Statistics  has  introduced  a 
number  of  changes  in  the  estimating  techniques,  including  the 
linking of the unemployment insurance (UI) data with the
Current Population Survey (CPS).

This paper discusses the methods used to develop the
unemployment estimates and efforts to improve the UI data and
the estimating procedures. In the first section, there is a
description of the UI data base, the starting point for the
monthly estimates of unemployment. The second section
describes the procedure for inflating the UI data (the 70-step
procedure) to derive a first stage estimate of total
unemployment. The third section discusses the benchmarking procedure,
the process by which the CPS is linked to each of the States,
including the interpolation and extrapolation procedures. The
fourth section describes planned improvements in the
procedures, including the expansion of the CPS sample, upgrading
the UI data base, and experiments with other estimating models.


THE  UNEMPLOYMENT  INSURANCE 
DATA  BASE 

Monthly estimates of total unemployment for States and
areas are derived by inflating data from the administrative
records of State UI systems and linking these estimates to
annual measures obtained from the CPS. Under Federal law,
States are free to develop their own UI program in accordance
with their own social philosophy subject to minimum standards
established by the Federal Government. Statistical data
obtained from UI records are a byproduct of the operations of the
various State programs and reflect differences in the legal
definition of unemployment as well as differences in
administrative interpretation of the UI laws.

In order to use UI data for deriving total unemployment
estimates for States consistent with the concepts followed to
measure unemployment for the Nation, it is necessary to
standardize the UI data for differences in legal definitions and in
the procedures used for collecting and tabulating the data. With
respect to the legal definitions of unemployment, five major
factors need to be considered: (1) coverage, (2) eligibility, (3)
disqualification provisions, (4) benefit duration, and (5)
forgiveness of earnings.

To facilitate discussion of these factors, I have developed data
that compare statistics from the UI systems of 27 large States
in 1974 (see table 1). This year was chosen in order to avoid
distortion in the data which occurred in subsequent years as a
result of the enactment of temporary programs to extend
coverage and increase the duration of benefits. The States
selected for comparison were the 27 CPS States in 1974.

Coverage 

Prior to the enactment of the Unemployment Compensation
Amendments of 1976, States were required to cover all but the
following types of employment: (a) agricultural labor; (b)
domestic service; (c) self-employed and unpaid family workers;
(d) certain types of nonprofit groups; and (e) State and local
government employees, except in hospitals and institutions of
higher learning. However, a number of States elected to broaden
the coverage of the UI law to include one or more of these types
of employment.

Table 1 shows the percent of nonagricultural wage and salary
employment covered by UI for the Nation and the 27 States
prior to the recent extension of coverage. For the Nation as a
whole, 87 percent of nonfarm employment was covered by UI.
Among the 27 States the variation in coverage is surprisingly
small, ranging from 84 percent in Kentucky to 98 percent in
Connecticut. Because these large States are predominantly
industrial, the lack of UI coverage in rural farm States is not
adequately portrayed.

Effective  January  1978,  coverage  will  be  extended  to  (1) 
agricultural  workers  employed  on  large  farms,  (2)  all  State  and 
local  government  employees,  (3)  private  nonprofit  school 
employees, and (4) domestic workers. Approximately 97
percent of the wage and salary employment in the United States
will be covered by UI.

Based on the data shown in table 1, it does not appear that
differences in coverage would have a major effect on the UI data
used for estimating total unemployment. This will be even more
the case when the extension of coverage is completed in 1978.
However, the lack of coverage of small farms and the
self-employed may cause problems in estimating unemployment
in rural areas of the Nation.

Eligibility 

Although a worker may be employed in an industry covered
by UI, he will not be eligible for benefits unless he meets
specific tests of attachment to the labor force as reflected in his
prior earnings and work experience. Eligibility requirements for
benefits are usually designed to confirm that a worker has
substantial attachment to the labor force. A claimant must have
earned a minimum amount of wages and/or worked a minimum
number of weeks during a four- or five-quarter period
immediately preceding his separation from employment.




Table 1. Significant Unemployment Insurance Factors for 27 Large States: 1974

                             Ineligi-   Disquali-  Duration   Percent of
                  Percent    bility     fication   of         weekly benefit
State             covered¹   rate²      rate³      benefits⁴  forgiven⁵

United States       87        13.1       26.0       24.4        (NA)

Alabama             85        21.0       17.1       23.8        11.6
California          88        14.6       36.7       23.8        27.9
Connecticut         98        20.5       22.1       25.9        55.0
Florida             87        18.1       33.6       20.6         7.4
Georgia             86        12.        28.2       20.7        14.3
Illinois            89        18.        24.6       23.8        10.5
Indiana             86        12.        30.9       20.7        20.0
Kentucky            84        13.        18.3       23.2        40.0
Louisiana           85        20.5       17.9       24.2        50.0

Maryland            86        21         38.2       26.0        15.6
Massachusetts       87         5         14.3       26.8        14.4
Michigan            86         6         41.8       23          50.0
Minnesota           85         9         71.8       23          38.0
Missouri            86        21          6.9       23          17.7
New Jersey          86         4.7       25.1       23          20.0
New York            87         5         24.7       26.0         0.0
North Carolina      87        12.3       16.1       25.2        50.0
Ohio                88         6.2       27.5       25.7        20.0

Oklahoma            88        27.0       61.8       22.3        14.5
Oregon              86        18.1       28.6       25.8        33.3
Pennsylvania        90        14.0       18.7       30.0        40.0
South Carolina      85        17.6       26.6       24.0        25.0
Tennessee           86        16.8        9.8       23.8        19.7
Texas               87        20.0       28.6       21.5        25.0
Virginia            84        16.1       28.4       23.2        33.3
Washington          86        25.8       22.4       25.2        25.0
Wisconsin           88         5.7       17.0       27.3        50.0

NA  Not available.

¹Estimated percent of nonagricultural wage and salary employment covered by UI.
²Percent of claimants declared ineligible because of insufficient wages or work experience in covered
employment.
³Total disqualifications due to all issues except labor disputes per 1,000 claimant contacts.
⁴Potential average duration of benefits (weeks) for insured claimants.
⁵The proportion of the average weekly payment disregarded for UI purposes.

Source: U.S. Department of Labor, Unemployment Insurance State Laws and Experience. U.S. Department
of Labor, Unemployment Insurance, tables 8 and 13.


The legal provisions of the various UI laws do not provide an
adequate quantitative measure for comparisons among the States.
However, by relating the number of claimants denied benefits
because of insufficient wages or work experience to the total
number of claimants for whom eligibility determinations were
made, one can develop an ineligibility rate (see table 1).

For the United States the average rate of ineligibility was
13.1 percent in 1974. The lowest rate was recorded by New
Jersey with 4.7 percent; the highest rate was recorded by
Oklahoma with 27.0 percent. In general, the large industrial
States in the Northeast and along the Great Lakes had lower
than average ineligibility rates, while the sun-belt States in the
South and West had higher than average ineligibility rates.


Almost  1.5  million  persons  were  denied  benefits  because  of 
insufficient  earnings  in  1974.  Because  this  is  such  a  sizable 
number  and  the  variation  in  eligibility  among  the  States  is  also 
quite  large,  any  formula  which  uses  UI  data  for  developing  total 
unemployment  estimates  should  take  State  differences  in  this 
factor  into  account.  The  absence  of  this  factor  may  be  one  of 
the  principal  reasons  why  the  70-step  procedures  used  to 
estimate  unemployment  may  be  consistently  in  error. 

Disqualification  Provisions 

Claimants who qualify for benefits because of previous work
experience may still be disqualified if the reasons for separation
from employment are not adequate or they fail to meet the
tests of ability and availability for work. Voluntary quits,
discharge for misconduct, and engagement in labor disputes are
grounds for disqualification in most States. However, the
length of the period of disqualification varies from State to
State. In all but a few States, disqualified claimants are not
required to report to UI offices and are not counted among the
insured unemployed.

To measure the differences in disqualification among the
States, one should compare the disqualification rates shown
in table 1. This rate is calculated by dividing the total number of
disqualifications for all reasons except labor disputes by the
number of times claimants contact the UI offices to claim
benefits, and is expressed per 1,000 claimant contacts.

The  disqualification  rate  for  the  United  States  was  26.0  per 
thousand  in  1974.  The  lowest  rate  was  6.9  per  thousand  in 
Missouri  and  the  highest  rate  was  71.8  per  thousand  in 
Minnesota.  Of  the  3  million  disqualifications  reported  in  1974, 
about one-third were due to voluntary quits. In more than half
the States, claimants who quit their jobs are denied benefits
for the duration of their unemployment. In the remaining States
the penalty period varies from a few weeks to many weeks. While
the disqualification rates may be indicative of the differences
in the State laws and the interpretation of these laws, they do not
take into account the severity of the penalty, merely the
incidence of disqualification.

Duration  of  Benefits 

Benefit  duration  is  based  on  the  amount  of  earnings  in  the 
base  period  in  all  but  eight  States.  The  duration  period  varies 
from  6  weeks  to  34  weeks.  Benefit  duration  ranges  from  26 
weeks  to  30  weeks  in  the  8  uniform  duration  States. 

To  determine  the  statutory  effects  on  potential  benefit 
duration,  estimates  have  been  made  for  each  of  the  27  States 
based  on  average  earnings  reported  in  1974.  In  the  United  States 
the  average  potential  duration  period  was  24.4  weeks  in  1974. 
Florida  had  the  shortest  average  potential  duration  of  20.6 
weeks,  while  the  longest  duration  of  30.0  weeks  was  reported 
by Pennsylvania. Because of the variation in duration of
benefits, claimants will exhaust benefits more frequently in
Florida than in Pennsylvania. However, in addition to the State
laws, the economies of the different States will also affect
benefit exhaustions. Needless to say, claimants who exhaust UI
benefits are not counted among the insured unemployed.

The duration period of benefits was temporarily extended in
1975 and 1976 as the result of enactment of the Federal
Supplemental Benefits (FSB) program. Claimants may collect
up to a maximum of 65 weeks of benefits or 2 1/2 times the
amount of benefits normally earned. Persons in extended
benefit status are not included in the regular count of insured
unemployment but are in a separate count of their own.

Forgiveness  of  Earnings 

In the CPS, any work for pay or profit would automatically
place the respondent in the employed category. In the UI
system, claimants are allowed to earn a certain percentage of
their weekly benefit without being penalized, and are still
counted as totally unemployed.

The  amount  of  earnings  forgiven  or  disregarded  is  specified 
by  law  as  either  a  dollar  amount  or  some  fraction  of  the  weekly 
benefit  amount.  For  purposes  of  comparison,  I  have  calculated 
the  forgiveness  amount  for  each  of  the  States  as  a  percent  of  the 
weekly benefit amount, as shown in table 1.

The percentage forgiven ranges from 7 percent to 55 percent.
In dollar amounts it ranges from $5 to $46, depending on the
State (only New York State has no forgiveness level). As long as
the worker's earnings do not exceed the State's forgiveness
amount, which may be as much as $46, he is still counted as
unemployed and included among the insured unemployed.

Collection and Tabulation of UI Data

UI statistics are usually compiled in local offices from
manual counts of documents or tally strokes. The collection
and tabulation procedures followed by most States are at
variance with established CPS concepts in three areas: (1) the
reference period of unemployment; (2) the geographic frame of
reference, i.e., place of work, place of residence, or place of
filing; and (3) the definition of total unemployment.

The  reference  period  is  not  exact  when  local  offices  base 
their  statistics  on  workload  counts.  States  used  to  count  the 
number  of  claims  processed  the  week  following  the  week  of 
unemployment,  on  the  assumption  that  most  claimants  file  for 
benefits  as  soon  as  they  complete  their  week  of  unemployment. 
However,  because  of  heavy  claims  activity,  most  States  now 
require  claimants  to  report  biweekly.  By  counting  the  number 
of  weeks  claimed,  an  assumption  is  made  that  one  of  the  two 
weeks  claimed  by  Mr.  X  represents  a  week  that  will  be  claimed 
by  Mr.  Y  when  Mr.  Y  reports  the  following  week.  This 
assumption  may  not  be  valid. 

The geographic frame of reference for unemployment
should be the county or city of residence. However, until
recently most States counted claims at the location of the UI
office and did not identify the claimant's residence. Where two
bordering States were involved and one coded by residence and
the other did not, commuter claimants were often lost in the
insured unemployment count.

The existence of a forgiveness level in most States creates a
conflict between the legal definition of unemployment and the
economic definition. All things being equal, the count of
insured unemployed is inflated to the extent that claimants with
any earnings are included. The problem is exacerbated when the
level of forgiveness varies from State to State.

THE  HANDBOOK  METHOD 
Description

The need for a consistent and uniform method of estimating
total unemployment for States and areas led to the development
of a 70-step formula (the Handbook Method) by the Department
of Labor in 1960. Total unemployment data on a current
basis for geographic areas were not available from existing
sources. The only data available were the UI data, and they did
not provide complete coverage of the unemployed population,
nor were they comparable between States because of differences
in the legal definition of unemployment.

The 70-step formula was designed to reproduce an estimate
of total unemployment which would have emerged if a CPS-type
household survey were conducted in an area. The UI data
were of primary importance in the existing system because they
provide the necessary geographic data on a current basis.

Total unemployment in an area can be divided into two
major categories. The first, the insured unemployed, is an actual
count of claims taken from persons certifying to a week of
unemployment. The second is an unknown quantity; it comprises
individuals classified according to the reason they are not
counted among the insured unemployed, e.g., those who (1) worked
in a noncovered industry, (2) were denied benefits because of
monetary ineligibility, (3) were disqualified because of quitting
a job, etc., (4) exhausted their benefit entitlement, or (5) newly
entered or reentered the labor force after a long absence from
work. The problem of estimating total unemployment can then be
reduced to estimating those components which fall into the
second category.

The starting point for the development of the methodology
was to obtain special tabulations of the unemployed from the
CPS which could fit these categories. This could not be done,
since CPS respondents are never asked about their UI status on
the regular monthly survey. As a compromise, data were
tabulated in terms of presumed UI coverage by industry and
class of worker. Three categories emerged: (1) unemployed
persons last working in industries presumed covered by UI, (2)
unemployed last working in noncovered industries, and (3)
unemployed entrants. Total unemployment is the sum of the
estimates for the three categories.

In the Handbook, the covered unemployment category
included a count of the insured unemployed and estimates
needed to standardize for differences in State UI laws affecting
benefit duration and non-monetary disqualifications (quits,
discharge, etc.). Estimates of unemployed persons who
exhausted UI benefits were derived from followup studies of UI
claimants conducted by the State Employment Security
Agencies. The post-exhaustion experience of this group was
linked to the labor force experience of claimants to obtain
factors needed to adjust for differences in benefit duration. It
was assumed that the same factors could be used to estimate the
experience of those disqualified from benefits for non-monetary
reasons. No specific provision was made to estimate the number
of persons denied benefits because of insufficient wages. It was
assumed that there was great mobility between the covered and
noncovered groups. To the extent that a certain number of
unemployed in noncovered industries may be counted as
insured unemployed by virtue of previous earnings, this would
compensate for the persons not counted in the insured
unemployed because of insufficient earnings. Also included
were persons who delayed filing for benefits and those who
failed to file.


First, the count of insured unemployed included persons
earning wages below the forgiveness level who, for purposes of
the legal definition of unemployment, were counted as
economically unemployed (in the CPS such persons are included
among the employed). Claimants whose earnings exceeded the
forgiveness level were removed from the insured unemployed.

Second, the noncovered unemployed category was designed to
standardize the UI data for differences in coverage by industry
and class of worker. Using relationships developed from the
unpublished tabulations, a set of weights for each industry and
class of worker was developed and applied to the insured
unemployment rate to develop an adjusted insured rate. The
adjusted rate was then applied to estimates of employment for
each industry to develop an unemployment estimate for each
noncovered industry.
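
As a rough illustration of the noncovered-industry step, the sketch
below multiplies the insured unemployment rate by a per-industry
weight and then by that industry's employment. Treating "applied" as
multiplication is an assumption, and the industry names, weights, and
employment figures are hypothetical; the actual Handbook weights are
not given in this paper.

```python
def noncovered_unemployment(insured_rate, weights, employment):
    """Estimate unemployment for each noncovered industry.

    insured_rate: insured unemployment rate as a fraction (e.g. 0.04)
    weights:      industry -> weight applied to the insured rate (hypothetical)
    employment:   industry -> employment estimate
    """
    estimates = {}
    for industry, weight in weights.items():
        adjusted_rate = weight * insured_rate  # adjusted insured rate
        estimates[industry] = adjusted_rate * employment[industry]
    return estimates


# Hypothetical inputs, for illustration only.
weights = {"agriculture": 1.5, "domestic service": 0.5}
employment = {"agriculture": 10_000, "domestic service": 4_000}
result = noncovered_unemployment(0.04, weights, employment)
```

Summing the values in `result` would give the noncovered category
total under these assumptions.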

Third, unemployed entrants and reentrants could not be
estimated directly for each area, because there was no
systematic procedure for collecting those data from
administrative records. This category was estimated by developing
synthetic factors relating the entrant unemployed to the size of
the labor force and the amount of unemployment in an area.
Subsequent revisions to the Handbook added another factor:
the proportion of youth in the population. The estimate of total
unemployed entrants is a composite estimate defined as:

U  =  A(X  +  E)  +  BX,  where 

U  =  total  entrant  unemployment 

E  =  total  employment 

X  =  total  experienced  unemployment 

A,B = synthetic factors incorporating seasonal variation
and an assumed relationship between the proportion
of youth in the working-age population and
the historical relationship of entrants to E and X.
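
The composite entrant estimate U = A(X + E) + BX can be written as a
short function. This is a minimal sketch; the values chosen for A, B,
E, and X are hypothetical and are not taken from the Handbook.

```python
def entrant_unemployment(a, b, employment, experienced_unemployment):
    """Composite entrant estimate U = A(X + E) + BX."""
    e = employment                 # E: total employment
    x = experienced_unemployment   # X: total experienced unemployment
    return a * (x + e) + b * x


# Hypothetical factors and levels, for illustration only.
u = entrant_unemployment(a=0.01, b=0.2,
                         employment=95_000,
                         experienced_unemployment=5_000)
```

The first term scales with the experienced labor force (X + E); the
second scales with experienced unemployment alone.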

Analysis  of  Handbook  Method 

Analyses of the estimates derived by the Handbook method
were conducted by Ullman and Lindauer in the early sixties. By
comparing estimates for States and major labor market areas
with estimates from the 1960 Census of Population, they
identified a regional bias in the Handbook method. While several
hypotheses were offered, none proved to be conclusive with
respect to causality.

In 1971, Cherin at the University of Houston was
commissioned by the Department of Labor to study the Handbook
estimates and determine whether revisions in the procedures
could be made which would improve the estimates. He
compared the percentage difference between the Handbook
estimates and the 1970 census estimates for States and areas and
explained these differences through an error analysis. Additional
work was done in comparing the CPS with the Handbook for
the 20 largest areas. Cherin's findings were as follows:

1. Although the Handbook method includes factors
which take into account differences in the unemployment
insurance program among the States, these factors apparently
do not completely capture these differences. There may be
additional factors affecting the behavior of unemployment
among areas. Moreover, some may be unique to one or a few
areas as nonrecurring events.

2. Analyses undertaken by the study to attempt to
explain the difference between the Handbook unemployment
estimates, CPS, and Census data did not yield results
which identified factors that could be introduced to improve
the Handbook method (see reference 1).

In the Bureau of Labor Statistics, we have been studying
the behavior of the Handbook estimates relative to the CPS
series at the State level. Our research suggests that the State
Handbook estimates have performed better than expected. The
total estimates have been better than the individual categories
because the Handbook has tended to overestimate the covered
unemployed and underestimate the entrant category. But
inaccuracies in the components may not always cancel
themselves out. We have found that there is a tendency for the
Handbook to underestimate during periods of low unemployment
and overestimate during periods of high unemployment.
This suggests that errors in the Handbook components are not
offsetting over a range of different economic conditions.

Even with these problems, the Handbook is less biased when
used as a time series than when it is used for place-to-place
comparisons. As indicated by the University of Houston study,
the factors incorporated in the Handbook procedure have not
satisfactorily standardized the UI data for differences in the
legal definitions of unemployment. See table 2 for an
illustration of this problem.


Table 2. Insured Unemployment and Total Unemployment for 27 Large States: 1974

                  Insured     Unemployed  Unemployed  Ratio     Ratio   Ineli-  Consis-
                  unemployed  CPS         HB          insured/  CPS/    gible   tency
State             (000)       (000)       (000)       CPS       HB      index   check
                     1           2           3           4        5       6       7

United States     2,248.5      5076        4998        44.3     1.02    1.00

Alabama              26.5        78         *64        34.0     1.21    1.60      X
California          284.4       669        *594        42.5     1.13    1.11      X
Connecticut          49.1        88          88        55.8     1.00    1.54      X
Florida              56.3       208        *132        27.1     1.57    1.37      X
Georgia              32.8       109         100        30.1     1.10     .98
Illinois             90.0       224         213        40.2     1.05    1.38      X
Indiana              41.9       123         120        34.1     1.03     .92
Kentucky             23.7        64          68        37.0      .95    1.05
Louisiana            29.6        97          94        30.5     1.03    1.56      X

Maryland             32.3        84          78        38.5     1.08    1.67      X
Massachusetts       106.7       190        *217        56.2      .88     .38      X
Michigan            163.4       289        *317        56.5      .92     .47      X
Minnesota            37.3        77         *97        48.4      .80     .73      X
Missouri             46.8        95         103        46.8      .92    1.66
New Jersey          131.1       203        *268        64.6      .76     .36      X
New York            264.2       482         476        54.8     1.01     .42
North Carolina       37.4       111         *94        33.7     1.18     .94
Ohio                 82.0       225         212        36.4     1.06     .47

Oklahoma             16.7        49          50        34.1      .98    2.06
Oregon               35.4        76         *66        46.6     1.15    1.38      X
Pennsylvania        152.7       258         264        59.2      .98    1.07
South Carolina       20.9        68         *59        30.7     1.15    1.34      X
Tennessee            32.8        92         *76        35.7     1.21    1.28      X
Texas                40.4       220        *186        18.4     1.18    1.53      X
Virginia             15.3        98         *61        15.6     1.60    1.23      X
Washington           61.7       108        *130        57.1      .83    1.97
Wisconsin            38.5        94        *106        41.0      .89     .44      X

*Significantly different from CPS at 1.6 sigma (90 percent probability).

There is a wide variation in the ratios of insured unemployment
to total unemployment for the 27 large States in 1974.
Virginia had a ratio of 15.6 percent of its unemployment
accounted for by UI claimants and New Jersey had a ratio of
64.6 percent. While some variation in this ratio could be
expected because of the differences in the State economies,
such wide disparity can only be a reflection of the State UI
laws.

To what extent has the Handbook method succeeded in
adjusting the UI data? Judging from the data in table 2, the
Handbook method has proven to be inadequate for the task.
Fifteen of the 27 States showed significant differences
between the Handbook estimates and the CPS
at the 90 percent confidence limits. If left to chance alone, one
might expect only three States to have significant differences.

It appears that a serious omission in the conceptual
framework of the Handbook may help explain the reason for the large
number of States with significant differences. This omission is
the group identified as ineligible for benefits because of
insufficient earnings or work experience. To confirm this point,
I converted the ineligibility rate from table 1 for each State to
an ineligibility index (table 2). The ineligibility index is derived
by dividing each State's rate by the average for the United
States, so that the value of unity is assigned to the United States.
An index in excess of unity indicates a more conservative State;
an index below unity indicates a liberal State.

I then compared the ineligibility index with the
CPS/Handbook ratios in table 2 (column 5). A mark in column 7
indicates that the ineligibility index is consistent with the CPS/HB
ratio. An ineligibility index in excess of unity, indicating a
conservative State, would produce Handbook estimates that are
too low (the CPS/HB ratio exceeds unity). An ineligibility index
below unity, indicating a liberal State, would produce Handbook
estimates that are too high (the CPS/HB ratio falls below unity).

Of the 27 States, 17 are marked for consistency. However,
when only the 15 States which reported significant differences
between the Handbook and CPS estimates are considered, 13
States proved to be consistent. It appears that one reason the
Handbook method produces poor results when compared
cross-sectionally with the CPS is the omission from the
procedures of the group of claimants denied benefits because of
insufficient wages or work experience.
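
The index and the consistency rule described above can be sketched in
a few lines. The function names are illustrative only; the three
States shown use the ineligibility rates from table 1 and the CPS/HB
ratios from table 2, and the rule is applied to those published
(rounded) figures.

```python
US_RATE = 13.1  # U.S. ineligibility rate in 1974 (table 1)


def ineligibility_index(state_rate, us_rate=US_RATE):
    """State ineligibility rate divided by the U.S. average."""
    return state_rate / us_rate


def is_consistent(index, cps_hb_ratio):
    """True when the index and the CPS/HB ratio lie on the same side of unity."""
    return (index > 1 and cps_hb_ratio > 1) or (index < 1 and cps_hb_ratio < 1)


# State -> (ineligibility rate, CPS/HB ratio)
states = {
    "Alabama": (21.0, 1.21),
    "Oklahoma": (27.0, 0.98),
    "Wisconsin": (5.7, 0.89),
}
checks = {
    name: is_consistent(ineligibility_index(rate), ratio)
    for name, (rate, ratio) in states.items()
}
```

Under this rule Alabama and Wisconsin come out consistent and
Oklahoma does not.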


BENCHMARKING  TO  THE  CPS 

Description 

BLS introduced a new system for developing unemployment
estimates for States and areas in 1974. The system involved
linking the unemployment estimates derived from the Handbook
estimates with annual measures of unemployment from
the CPS. This system is known as the benchmarking procedure
and is accomplished in several stages.

The first stage is the inflation of the UI data to a measure of
total unemployment. For these purposes the Handbook method
as described earlier, with a few modifications, is used. The
modifications were made to reflect changes in the UI laws with
respect to duration of benefits and changes in the seasonal
entrant A and B factors to reflect more up-to-date seasonal
trends. Estimates are derived independently for the State and
individual labor market areas and counties. The estimates for
the individual areas are required to exhaust the State total. If
the sum of the areas is inconsistent with the State total, an
additivity adjustment is introduced to ensure linearity.
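
The paper does not say how the additivity adjustment is computed. One
simple possibility, shown here purely as an assumption, is to scale
every area estimate by the same factor so that the areas sum to the
State total. The area names and figures are hypothetical.

```python
def additivity_adjustment(area_estimates, state_total):
    """Scale area estimates proportionally so they sum to state_total.

    Proportional (pro rata) scaling is an assumption of this sketch,
    not a procedure documented in the paper.
    """
    area_sum = sum(area_estimates.values())
    if area_sum == 0:
        raise ValueError("area estimates sum to zero; cannot scale")
    factor = state_total / area_sum
    return {area: value * factor for area, value in area_estimates.items()}


# Hypothetical area estimates (thousands) and State total.
areas = {"Area A": 40.0, "Area B": 35.0, "Balance of State": 20.0}
adjusted_areas = additivity_adjustment(areas, state_total=100.0)
```

Proportional scaling preserves each area's share of the total, which
is why it is a common default for this kind of reconciliation.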

The second stage estimate reflects the adjustment of the
Handbook estimates by a correction factor derived from the
previous year's CPS. The factor is developed by adjusting the
previous year's December Handbook estimates by the annual
CPS/Handbook ratio and deriving the algebraic difference
between the original Handbook estimate and adjusted Handbook
estimates. The correction factor is added to each month's
current Handbook estimate to derive a total unemployment
estimate linked to the previous year's CPS.

It can be shown that this is the mathematical equivalent of
extrapolating the previous December's benchmarked estimate
by the arithmetic change in the current year's monthly
Handbook estimates. In essence, the monthly Handbook trends
are accepted as a reasonable reflection of the underlying trend,
but are calibrated to the annual level of the CPS to correct for
bias due to differences in the UI laws.
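
The equivalence can be checked numerically. In the sketch below the
December Handbook estimate is multiplied by the annual CPS/Handbook
ratio (reading "adjusting by the ratio" as multiplication is an
assumption), the difference is taken as the correction factor, and
the result is compared with extrapolating the benchmarked December
level by the Handbook change. All figures are hypothetical.

```python
def correction_factor(dec_handbook_prev, cps_hb_ratio_prev):
    """Adjusted December Handbook estimate minus the original estimate."""
    adjusted = dec_handbook_prev * cps_hb_ratio_prev
    return adjusted - dec_handbook_prev


def extrapolate(monthly_handbook, dec_handbook_prev, cps_hb_ratio_prev):
    """Add the correction factor to each current monthly Handbook estimate."""
    c = correction_factor(dec_handbook_prev, cps_hb_ratio_prev)
    return [h + c for h in monthly_handbook]


# Hypothetical figures (thousands).
dec_hb = 100.0                   # previous December Handbook estimate
ratio = 1.10                     # previous year's annual CPS/Handbook ratio
monthly = [104.0, 98.0, 101.0]   # current-year Handbook estimates

estimates = extrapolate(monthly, dec_hb, ratio)

# Equivalent form: benchmarked December level plus the Handbook change.
dec_benchmarked = dec_hb * ratio
equivalent = [dec_benchmarked + (h - dec_hb) for h in monthly]
```

Both lists agree up to floating-point rounding, since
h + (r*D - D) = r*D + (h - D).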

Several  other  extrapolator  procedures  were  tested  before 
adopting  the  arithmetic  method.  The  multiplicative  method 
(ratio  adjustment)  proved  to  be  less  effective  in  most  States 
than  the  arithmetic  method  over  the  full  range  of  the  business 
cycle.  However,  in  some  States  the  multiplicative  method 
proved  to  be  better.  It  was  decided  to  adopt  a  uniform  method, 
based  on  the  use  of  the  data  in  allocating  Federal  funds  and  the 
possible  legal  consequences  of  using  different  formulas. 

The  third  stage  of  benchmarking  involves  the  correction  of 
the  Handbook  estimates  and  forcing  the  corrected  monthly 
estimates  to  average  to  the  CPS  annual  average.  The  corrections 
are  developed  by  a  mathematical  procedure  which  is  designed  to 
minimize  disruption  to  the  month-to-month  change  while 
forcing  consistency  between  the  monthly  estimates  and  the 
annual  average. 

First, the Handbook estimates are adjusted by the annual
CPS/Handbook ratio. Second, the previous year's annual ratio is
compared with the current year's ratio and the difference
between the two ratios is wedged over the 24-month period.
This procedure ensures that the difference between the two
ratios is not reflected solely in the December (previous year)
and January (current year) trend. The wedging is accomplished
by assigning weights to each of the months and developing a
composite ratio which is used to adjust the Handbook estimates
for that month. Finally, the adjusted monthly estimates are
forced to average to the CPS annual averages.
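
The monthly weights used for wedging are not given in the paper. The
sketch below assumes linear weights across the 24 months and a simple
proportional scaling for the final forcing step; both choices, and all
figures, are illustrative assumptions rather than the BLS procedure.

```python
def wedge_ratios(prev_ratio, cur_ratio, months=24):
    """Spread the change between two annual ratios over `months` months.

    Linear weights are an assumption made for this sketch.
    """
    step = (cur_ratio - prev_ratio) / (months - 1)
    return [prev_ratio + step * i for i in range(months)]


def force_to_average(monthly, annual_average):
    """Scale monthly estimates so that their mean equals annual_average."""
    scale = annual_average / (sum(monthly) / len(monthly))
    return [m * scale for m in monthly]


# Hypothetical inputs: 12 current-year Handbook estimates (thousands),
# annual CPS/Handbook ratios for two years, and the CPS annual average.
handbook = [100.0 + m for m in range(12)]
ratios = wedge_ratios(prev_ratio=1.10, cur_ratio=1.20)
adjusted = [h * r for h, r in zip(handbook, ratios[12:])]
final = force_to_average(adjusted, annual_average=125.0)
```

Because the composite ratio changes gradually, no single
month-to-month change absorbs the whole difference between the two
annual ratios.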

The interpolation wedging procedure described above
involves the pairing of two successive years. Thus, each year is
paired twice, once with the preceding year, and once with the
succeeding year. In essence three estimates are derived for every
month before the estimates become final. The first is derived on
a current basis by extrapolation. The second is derived through
interpolating the current year with the previous year, as a result
of benchmarking to the current year's annual CPS average. The
third is derived through interpolating the current year with the
succeeding year, as a result of benchmarking to the succeeding
year's annual CPS average.


Analysis 

The acid test of the effectiveness of the extrapolation
procedure is success or lack of success in predicting the
annual CPS estimate. The underlying hypothesis is that there is
less bias in the Handbook trends than there is in the Handbook
levels, and that the bias in trends is relatively consistent. This
stems from the fact that the major cause of bias in the
Handbook is the different State UI laws and that the differences
are greater across States than they are across time for any one
State.

Table 3 shows the mean percentage revision required to
benchmark to the CPS using three different extrapolation
methods. The percentage revision is derived using the
extrapolator as a base rather than the more conventional method of
using the CPS as a base. Thus, the percentage revision is
expressed as (C - X) ÷ X, where C equals the CPS and X equals
the extrapolator.
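The revision measure can be illustrated with a short sketch. The function names and the sample figures below are hypothetical. Averaging absolute values is also an assumption, since the text does not say whether signs are retained when the mean is taken.

```python
def percentage_revision(cps: float, extrapolated: float) -> float:
    """Percent revision with the extrapolator as base: 100 * (C - X) / X."""
    if extrapolated == 0:
        raise ValueError("extrapolated estimate must be nonzero")
    return 100.0 * (cps - extrapolated) / extrapolated


def mean_absolute_revision(pairs):
    """Mean of the absolute percent revisions over (cps, extrapolated) pairs.

    Taking absolute values before averaging is an assumption of this sketch.
    """
    revisions = [abs(percentage_revision(c, x)) for c, x in pairs]
    return sum(revisions) / len(revisions)


# Hypothetical annual figures for one State: (CPS benchmark, extrapolated estimate).
example_years = [(110_000, 100_000), (95_000, 100_000)]
print(mean_absolute_revision(example_years))
```

Using the extrapolator as the base expresses the revision as the adjustment a user of the published estimate would experience once the benchmark arrives.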

Of  the  27  largest  States  for  which  historical  data  are 
available,  18  States  experienced  a  smaller  revision  using  the 
current  extrapolator  than  using  the  Handbook  estimates 
directly.  The  average  percentage  revision  among  all  States  was 
14.8  percent  for  the  Handbook  and  9.5  percent  for  the  current 
extrapolator.  This  would  appear  to  confirm  the  hypothesis  that 
the  Handbook  trends  are  less  biased  than  the  Handbook  levels. 

The  second  assumption,  concerning  the  consistency  of  the 
Handbook  bias,  however,  has  proved  to  be  false  in  most  States. 
The  amount  of  revision  required  after  extrapolation  was  far 
greater  at  9.5  percent  than  the  average  coefficient  of  variation 
of  the  CPS  sample  estimate  at  5.3  percent.  If  the  bias  were 
perfectly  consistent  across  time  in  each  State,  the  amount  of 
revision  due  to  extrapolation  would  not  exceed  the  sampling 
error  of  the  CPS. 


Table 3. Mean Percentage Revision by Type of Extrapolation for 27 States: 1971-1976

State | Coefficient of variation | Handbook estimate | Current extrapolator (1) | MVA extrapolator (2)

California
New York
Pennsylvania
Illinois
Ohio
Massachusetts
New Jersey
Texas
Michigan
Florida
Indiana
Missouri
Georgia (3)
Wisconsin
Connecticut (3)
North Carolina
Tennessee (3)
Minnesota (3)
South Carolina (3)
Louisiana (3)
Washington
Alabama (3)
Kentucky (3)
Oklahoma (3)
Oregon (3)
Virginia
Maryland

(1) Estimate derived by annual extrapolator.
(2) Estimate derived by monthly moving average extrapolator.
(3) Data derived from monthly CPS estimates for 1973-1976.

It is interesting to note that in 7 States, the percentage
revision  required  after  extrapolation  was  less  than  the  sampling 
error  of  the  CPS.  This  suggests  that  in  these  States  the  bias  was 
consistent  across  time,  and  served  to  lower  the  mean  square 
error  of  the  difference  between  the  extrapolated  measure  and 
the  actual  CPS. 

In  theory,  the  mean  square  error  of  the  difference  between 
two  independent  estimates  will  equal  the  sum  of  the  mean 
square  errors  of  each  estimate.  Thus,  if  the  mean  square  error 
associated  with  the  extrapolator  (Handbook  trend)  is  about  the 
same  in  each  State,  the  amount  of  revision  required  to 
benchmark  to  the  CPS  would  be  proportional  to  the  size  of  the 
CPS  sampling  error.  From  the  data  shown  in  table  3  where  the 
States  are  listed  in  order  by  size  of  sampling  error  coefficient  of 
variation  (CV),  it  seems  quite  clear  that  there  is  little  or  no 
relationship  between  the  size  of  the  revision  and  the  size  of  the 
sampling error. Correlating the percentage revision with the CV
using both Spearman's and Pearson's correlation methods yielded
coefficients which proved not to be significant at the 90 percent
confidence level.
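The two correlation measures mentioned above can be sketched with the standard library alone. The CV and revision figures below are made up for illustration and are not the Table 3 values; no significance test is included.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical State-level figures (percent): sampling CV and mean revision.
cv = [2.1, 3.0, 4.2, 5.5, 6.8, 8.0]
revision = [9.0, 4.5, 12.0, 6.0, 10.5, 7.0]
print(pearson(cv, revision), spearman(cv, revision))
```

Coefficients near zero on both measures would be consistent with the finding that the size of the revision bears little relation to the size of the sampling error.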

The conclusion one must draw, therefore, is that increasing
the CPS sample size to reduce the annual CV would not
significantly affect the size of the annual revision required from
benchmarking to the CPS. Moreover, this lack of convergence
between the UI trends and the CPS trends raises serious doubts
about the use of UI data as economic measures. It also suggests
that the CPS estimates may be biased in some States due to
nonsampling errors and that the mean square error of the
estimate is somewhat larger than the sample variance.


IMPROVING  THE  UNEMPLOYMENT  ESTIMATES 

CPS  Expansion 

In view of the lack of convergence between UI data and CPS
unemployment data, the most effective means of improving
State unemployment estimates appears to be the direct
replacement of the former by the latter. At present, the CPS sample is
adequate  to  provide  annual  unemployment  estimates  for  50 
States  and  30  SMSA's.  Any  move  for  sample  expansion  must 
choose  between  adding  more  areas  or  shortening  the  time  period 
between  benchmarks,  from  annually  to  monthly  or  perhaps 
quarterly. 

In 1975, BLS proposed an expansion which would add more
areas, while retaining the main features of the annual
benchmarking procedures. This choice was dictated by the needs of
the existing CETA legislation where allocations were made
primarily on annual average estimates for large cities and
counties located in SMSA's and "balances of States," defined as
the combination of counties outside SMSA's. After 2 years of
planning and sample selection, this expansion plan was
abandoned because of a major shift in legislative emphasis and
the BLS experience with annual benchmarking, which has
resulted in unacceptably large ex post revisions.


The emergence of the steepest recession in 30 years caused
the Congress to modify the CETA legislation from a training
program  to  deal  with  structural  unemployment  to  a  program  to 
provide  counter-cyclical  assistance  to  small  communities  in  the 
form  of  public  service  jobs.  Supplemental  appropriations  were 
provided  under  CETA  and  allocations  were  based  on  the  most 
recent  monthly  data.  Additional  legislation  created  other 
counter-cyclical  assistance  in  the  form  of  local  public  works  and 
revenue  sharing  with  formulas  targeted  to  the  use  of  monthly 
unemployment  data.  All  of  these  actions  suggested  that  BLS 
should  shift  priorities  to  getting  more  accurate  monthly  data  at 
the State level and concentrate on developing appropriate
estimating methodologies to disaggregate the State CPS
estimates to local area estimates through the use of UI
administrative data.

BLS  is  planning  to  expand  the  CPS  sample  initially  to 
provide  monthly  unemployment  estimates  for  States  with  a  10 
percent  CV  on  the  unemployment  level,  and  eventually  to 
provide  data  with  a  monthly  CV  of  7.5  percent.  The  ultimate 
success  of  this  expansion  will  depend,  of  course,  on  the  ability 
to  obtain  financial  resources  from  the  Congress  and  the  ability 
of the Bureau of the Census to absorb the enormous workload.

If  legislation  did  not  dictate  the  need  for  monthly  estimates, 
a  much  more  cost-effective  method  of  expanding  the  sample 
would  be  to  obtain  quarterly  data  for  each  State.  Policy 
measures  requiring  fiscal  actions  or  other  actions  to  deal  with 
unemployment  are  rarely  considered  by  State  governments  on  a 
real  time  basis.  As  a  practical  matter  State  legislatures  are  not  in 
session  throughout  the  year  to  vote  on  such  actions.  However, 
in  view  of  the  existing  legislation  which  requires  monthly  data, 
an  expansion  of  the  sample  to  provide  only  quarterly  data  could 
make  matters  worse  by  requiring  more  frequent  revisions  of  the 
monthly  estimates  and  a  great  loss  of  public  confidence  in  the 
unemployment  estimates. 

Improving the UI Data Base

On  the  surface,  it  would  appear  that  shifting  emphasis  to 
obtain  monthly  CPS  estimates  at  the  State  level  would  alleviate 
the need for improved UI data. However, monthly estimates of
unemployment are required by legislation for all counties in the
United States and cities of 25,000 population or more. The only
viable means of obtaining these data, in lieu of a monthly census,
is to use the administrative records of the State unemployment
insurance systems.

In discussing the UI data base, I pointed to three areas where
current operating procedures followed by State employment
security agencies yield byproduct data which are incompatible
with CPS concepts. These areas of concern are: (1) the time
reference of the data, (2) the geographic reference of the data,
and (3) the definition of unemployment for claimants with
earnings. Each of these three areas can be improved by
appropriate coding of microdata and systematic
computerization of the statistical reporting system.

The BLS has invested almost 3 million dollars in the
improvement of the UI data base in about 40 States which have
contracted with the Department of Labor. The remaining States
appear to be able to deliver the needed data without additional
financial assistance. By January 1978, or shortly thereafter, all
States will be in a position to provide UI data by county of
residence  and  most  States  will  have  exact  counts  of  the  number 
of  claimants  unemployed  including  the  appropriate  reference 
week.  With  this  accomplished,  the  BLS  will  be  able  to  introduce 
improved  methods  for  estimating  unemployment  at  the  county 
level. 

MORE  ACCURATE  ESTIMATING  MODELS 

The  lack  of  success  of  the  Handbook  method  to  estimate 
total  unemployment  as  well  as  the  frustrating  experience  of 
other  researchers  in  improving  the  70-step  model  has  led  the 
BLS  to  believe  that  an  entirely  new  estimating  system  is  needed. 
In an effort to improve the estimating methodology, a multiple
regression model has been developed for each State. The model
defines the monthly CPS level of unemployment as a linear
function of a count of the insured unemployed, excluding
claimants receiving partial benefits; a count of the most recent
three months of benefit exhaustees; an exogenous estimate of
total employed; and monthly dummy variables designed to pick
up seasonal influences on unemployment (mainly the seasonal
impact of the entry of new workers) not otherwise explained by
the independent variables.
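The structure of such a model can be sketched as follows. This is only an illustration of the regressors named above. The intercept, the use of 11 dummies with January as the baseline month, and every coefficient value are assumptions. In practice the coefficients would be estimated separately for each State by least squares.

```python
def design_row(insured_unemployed, exhaustees_3mo, total_employed, month):
    """Regressors for one month: intercept, three counts, 11 monthly dummies.

    January is treated as the baseline month (an assumption of this sketch),
    so the dummies cover February through December.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    dummies = [1.0 if month == m else 0.0 for m in range(2, 13)]
    counts = [float(insured_unemployed), float(exhaustees_3mo), float(total_employed)]
    return [1.0] + counts + dummies


def predict(coefficients, row):
    """Linear prediction of the CPS unemployment level for one month."""
    if len(coefficients) != len(row):
        raise ValueError("coefficients and regressors differ in length")
    return sum(b * x for b, x in zip(coefficients, row))


# Hypothetical coefficients: intercept, insured unemployed, exhaustees,
# total employed, then the February-December dummies.
coefficients = [1000.0, 1.5, 0.8, 0.01] + [0.0] * 11
coefficients[5] = 500.0  # hypothetical seasonal effect for March
march = design_row(20_000, 3_000, 1_000_000, month=3)
print(predict(coefficients, march))
```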

Aside  from  the  attractive  operational  features  of  a  regression 
model,  this  approach  was  designed  to  minimize  errors  resulting 
from  estimating  inputs  and  compounding  errors  in  a  building 
block  procedure.  Specific  error  measures  for  each  State  would 
also  be  provided  in  order  to  evaluate  the  effectiveness  of  each 
State model. In a preliminary report of progress on this
regression technique, the BLS found that the regression
equations made better predictions of CPS levels of unemployment
than  the  estimates  developed  from  the  Handbook  method  in  the 
15  largest  States  for  which  monthly  CPS  historical  data  were 
available.  However,  subsequent  research  indicates  that  the 
regression  technique  does  not  predict  annual  CPS  trends  as  well 
as  the  Handbook,  and,  therefore,  would  not  be  an  effective 
substitute  for  the  Handbook,  in  terms  of  reducing  the  size  of 
the  annual  revision. 

Additional  work  on  refining  the  regression  technique  is 
underway  in  the  BLS.  More  emphasis  will  be  placed  on  methods 
of disaggregating State CPS estimates using local area UI data
rather  than  continuing  the  current  practice  of  developing 
independent  area  estimates  and  forcing  consistency  between  the 
sum  of  the  areas  and  the  State  estimates. 

Another method for developing more accurate unemployment
estimates is to modify the current extrapolation
procedures to take advantage of monthly CPS data. Current
procedures extrapolate unemployment estimates by applying
correction factors based on the previous year's CPS annual
average. The correction factor which is developed reflects the
CPS and Handbook relationship derived from 12 months of data
ending in December. Since the correction factor remains fixed
and is extrapolated over the next 12 months, the implicit
assumption is that the Handbook bias is consistent. This
assumption has, of course, proved to be false.

By  applying  a  new  correction  factor  each  month,  based  on  a 
moving  average  of  the  previous  12  months  of  data,  it  can  be 
shown  that  the  size  of  the  year-end  revision  due  to  CPS 
benchmarking can be significantly reduced. As shown in table 3,
the mean percentage revision associated with the moving average
extrapolator (MVA) is reduced in all but one of the 27 States.
More recent research in BLS shows that the use of a 6-month
moving average is even more effective and will not destroy the
underlying seasonal movements of the Handbook estimates.
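A moving average correction factor of this kind might be computed as in the sketch below. Forming the factor as the ratio of the CPS total to the Handbook total over the window is an assumption, since the text does not give the exact formula. All names and figures are hypothetical.

```python
def moving_average_factor(cps_history, handbook_history, window=12):
    """Ratio of CPS to Handbook levels over the most recent `window` months.

    The ratio of sums equals the ratio of the two moving averages.
    """
    if len(cps_history) != len(handbook_history):
        raise ValueError("histories must cover the same months")
    if len(cps_history) < window:
        raise ValueError("not enough months of history for the window")
    return sum(cps_history[-window:]) / sum(handbook_history[-window:])


def extrapolate(handbook_current, cps_history, handbook_history, window=12):
    """Adjust the current Handbook estimate by the latest correction factor."""
    factor = moving_average_factor(cps_history, handbook_history, window)
    return handbook_current * factor


# Hypothetical monthly levels (thousands) for the 12 months before the
# month being estimated.
handbook = [100.0] * 6 + [104.0] * 6
cps = [110.0] * 6 + [118.0] * 6
print(extrapolate(105.0, cps, handbook))            # 12-month window
print(extrapolate(105.0, cps, handbook, window=6))  # 6-month window
```

Because the factor is recomputed each month, a drift in the relationship between the two series is absorbed gradually instead of appearing all at once in the year-end benchmark revision.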

A Brief Description of County Business
Patterns - Past, Present, and Future

Robert  Schiedel 
Bureau  of  the  Census 

One  of  the  earliest  users  of  administrative  records  to  obtain 
statistical  data  is  the  County  Business  Patterns  (CBP)  program. 
This  series  started  the  cooperative  work  of  several  government 
agencies  and  private  industry  sharing  the  responsibilities  and 
costs  of  providing  urgently  needed  statistical  summary  data  of 
an  economic  nature  at  the  county  level. 

County Business Patterns is a product derived from administrative
reports supplemented with statistical inputs from the
mail  survey  of  selected  multiestablishment  employers  not 
reporting  in  sufficient  county  and  kind-of-business  detail  for 
CBP  purposes.  Statistical  use  of  administrative  reports  was 
recommended  in  the  1948  report  of  the  National  Bureau  of 
Economic  Research  to  the  Hoover  Commission  on  Organization 
of  the  Executive  Branch  of  the  Government  and  in  the  1954 
report  of  the  Intensive  Review  Committee  to  the  Secretary  of 
Commerce. 

PAST  COUNTY  BUSINESS  PATTERN 
PROGRAM 

The  first  CBP  reports  provided  employment  and  payroll  data 
covering  the  first  quarter  of  1946.  Data  were  provided  for 
Standard  Industrial  Classification  (SIC)  economic  divisions  and 
major  groups  for  the  United  States,  for  each  State,  and  each 
large  county.  SIC  economic  division  totals  were  published  for 
small  counties  and  a  combination  of  some  counties  for  eight 
States.  After  1946,  CBP  was  published  intermittently  until  1964 
at  which  time  it  became  an  annual  publication. 

The  CBP  program  from  1949  through  1973  represented  a  joint 
effort  of  the  Social  Security  Administration  (SSA)  and  the 
Bureau of the Census to provide economic statistics that the
business community, State, and local county governments
utilize in a wide variety of ways. Statistics were provided on
reporting units, first quarter Federal Insurance Contribution Act
(FICA) taxable payroll, and mid-March pay period employment
by 4-digit SIC industry classification and county location. The
data are primarily derived from SSA's employment and taxable
payroll information for employer identification (EI) numbers
assigned in connection with FICA, reported on Treasury
Form 941, Schedule A.

Each legal entity (corporation, partnership, single proprietorship,
etc.) is required to file a separate Employer's Quarterly
Federal Tax Return, Treasury Form 941, regardless of affiliation,
stock ownership, or control. The IRS Form SS-4,
Application for Employer Identification Number, requests
physical location and classification information which allows for
the initial assignment of the geographic and SIC codes by the
SSA. The Schedule A portion of Form 941 requests March
12th pay period employment in addition to the taxable payroll
data.  These  data  are  the  major  source  of  administrative  data 
used  to  compile  CBP.  Supplemental  information  was  obtained 
annually  from  some  4,000  multiestablishment  employers  in  a 
special  survey  conducted  jointly  by  the  Census  Bureau  and  the 
SSA.  Those  companies  which  failed  to  report  on  Form  941, 
Schedule  A,  in  sufficient  detail  for  CBP  purposes  were  canvassed 
for  additional  information. 

Due to the SSA's method of collecting data (Establishment
Reporting Plan), the statistics in CBP were tabulated in terms of
"reporting units." The Establishment Reporting Plan applies a
standard system to the reporting procedures of multiestablishment
employers so that the needed statistical information can
be obtained with a minimum reporting burden on these
employers. Under the Establishment Reporting Plan system, a
nonmanufacturing employer is allowed to aggregate employment
and payroll data for a number of establishments with
identical industrial activity in the same county and supply the
data for the establishment group as a whole, called a "reporting
unit." Thus, detailed data for each physical location of a
business are not always available. All manufacturing establishments
report data separately even if two or more are in the same
industry in the same county.

County Business Patterns programs prior to the 1974
publication reflected data on first quarter FICA taxable payroll
and mid-March pay period employment as reported to the Social
Security Administration. The data in the publication represent
the following types of employment covered by the FICA: (a)
all covered wage and salary employment of private nonfarm
employers and of nonprofit membership organizations under
compulsory coverage, and (b) all employment of religious,
charitable, educational, and other nonprofit organizations
covered under the elective provisions of the FICA.

The  following  types  of  employment,  whether  covered  in 
whole  or  in  part  by  the  Social  Security  program,  are  excluded 
from  CBP  tabulations:  Government  employees,  self-employed 
persons,  farm  workers,  domestic  service  workers,  railroad 
employment  subject  to  the  Railroad  Retirement  Act,  and 
employment  on  ocean-borne  vessels. 

Mid-March pay period employment is the count of employees
during the pay period that includes March 12th of the
subject year, as reported on the quarterly Treasury Form 941,
Schedule A.

First  quarter  FICA  taxable  payrolls  are  defined  as  the 
amount  of  taxable  wages  paid  for  covered  employment  during 
the January-March quarter. Wages in excess of the taxable limit
are  not  reflected  in  CBP  data.  The  specific  limit  on  the  amount 
of  wages  that  are  subject  to  the  FICA  withholding  tax  is  raised 
frequently.  For  1973  the  amount  of  wages  subject  to  FICA 
withholding  was  $10,800. 

In  order  to  provide  geographic  detail,  reporting  units  are 
assigned  county  geographic  codes  on  the  basis  of  the  physical 
location  of  the  establishments  represented  by  the  reporting 
units.  Reporting  units  without  a  fixed  location  within  a  State  or 
of unknown county locations are tabulated separately under a
"statewide" classification for each State.

The entire business of individual employers with 50 employees
or fewer is regarded by SSA as a single reporting unit
regardless of the county and industry distribution of the
operation. However, the data in CBP for such employers are
distributed by county and industry based on individual
establishment records available from the most current economic
census.

The disclosure provisions of the Bureau, in accordance with
Federal regulations, provide that data may not be released if the
operations of an individual employer might be disclosed. The
number of reporting units and their distribution by
employment-size class are not considered a disclosure, and these items
may appear in instances where employment and payroll data are
withheld. Data are not shown separately for any industry that
does not have at least 100 employees or 10 reporting units in
the area (United States, State, or county) covered by the
tabulation. Data for an unpublished industry are included in the
total shown for the broader industry group of which it is a part.
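The publication threshold can be illustrated with a small sketch. It takes the employee threshold as 100 and reads the rule as "shown separately if the industry has at least 100 employees or at least 10 reporting units," which is one interpretation of the wording. The SIC codes and counts are hypothetical, and the separate suppression of cells that would disclose an individual employer's operations is not modeled.

```python
def shown_separately(employees, reporting_units):
    """True if an industry meets the size threshold for its own line."""
    return employees >= 100 or reporting_units >= 10


def publish(industries):
    """Split one broader industry group into published lines and a total.

    `industries` maps a detailed SIC code to (employees, reporting_units).
    Every industry, published or not, contributes to the group total.
    """
    shown = {
        sic: employees
        for sic, (employees, units) in industries.items()
        if shown_separately(employees, units)
    }
    group_total = sum(employees for employees, _ in industries.values())
    return shown, group_total


# Hypothetical detailed industries within one broader group in one county.
county_group = {"541": (250, 4), "542": (40, 12), "543": (30, 3)}
print(publish(county_group))
```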

The  1973  County  Business  Patterns  publication  was  the  final 
edition  under  these  described  concepts. 

PRESENT  COUNTY  BUSINESS  PATTERN 
PROGRAM 

The  introduction  of  the  Standard  Statistical  Establishment 
List  (SSEL)  system  provided  the  basis  for  the  expansion  of  the 
1974  CBP  program.  The  development  of  the  SSEL  constitutes  a 
major  step  toward  modernizing  and  standardizing  the  means  by 
which  various  statistical  programs  of  the  government  collect  and 
process information. The SSEL consists of a central
multipurpose computerized file of the name, address, and EI number of all
domestic establishments with employees covered by FICA in
the economic sector. Each name and address record on the file
contains a set of standardized data and information items.

For single-unit firms, the data in the new series starting with
1974 are obtained from administrative records of the IRS as
well as from administrative records of the SSA. More
specifically, payroll data by EI number are obtained from the
quarterly IRS Treasury Form 941, filed by employers for
withholding payroll for income tax, and employment data from
SSA Form 941, Schedule A, filed by employers for Social
Security tax purposes.

Individual  establishment  data  for  multi-location  firms  are 
provided  by  the  Bureau's  Annual  Company  Organization  Survey 
which  is  an  integral  part  of  the  SSEL.  Each  year  the  Census 
Bureau  mails  a  Company  Organization  Report  to  all  known 
multiestablishment companies with 50 employees or more. The
smaller companies are mailed report forms about every 3 years
on a rotating basis. These reports prelist the names and addresses
of the companies' establishments and request an update of the
establishment listing. The companies are asked to report
employment and payroll data and indicate a kind-of-business or
address change, if any, occurring during the year.

Initially, all employers are assigned industry and county
classifications by the SSA on the basis of nature of business and
location information supplied on their applications for EI
numbers or by a supplementary inquiry.


The  Bureau  of  the  Census  also  assigns  industry  and  county 
classifications to a substantial number of establishments
canvassed in its economic censuses and current economic programs
on  the  basis  of  information  on  actual  physical  location  and 
major  activity.  Industry  classifications  for  1974  and  1975  are 
based  on  the  1972  edition  of  the  SIC  manual  issued  by  the 
Executive  Office  of  the  President,  Office  of  Management  and 
Budget. 

CBP  is  now  tabulated  on  an  establishment  basis  as  opposed 
to  the  reporting  unit  concept  used  in  the  prior  years.  The 
establishment,  a  single  physical  location  at  which  business  is 
conducted,  is  generally  considered  to  be  the  smallest  basic  unit 
for  which  key  figures  of  economic  activity  are  available.  The  use 
of  the  establishment  concept  provides  for  a  more  detailed  and 
definitive  level  of  data  publication  not  possible  under  the 
former  reporting  unit  basis.  Additionally,  the  use  of  the 
establishment  as  the  basic  unit  of  economic  activity  will  allow 
consistency  with  other  major  data  series  currently  conducted  by 
the  Census  Bureau. 

The  present  CBP  program  provides  summary  data  on  an 
annual  basis  on  number  of  employees  for  the  week  including 
March  12th,  first  quarter  total  payroll,  total  annual  payroll,  and 
number of establishments by employment-size class. To marketing
people interested in a measurement of outlets, the establishment
count will be a significant improvement. Data will be
published for over 700 detailed kinds of businesses based on the
1972 SIC manual providing the business and industrial makeup
of each of the more than 3,000 counties and county equivalents
in the United States. In addition, summary data are provided for
each State and the United States. Summary lines with fewer than
50 employees are not shown separately but are included in the
next broader SIC group.

The  publishing  of  data  on  total  quarterly  and  annual  payroll 
overcomes  one  of  the  inherent  drawbacks  of  the  earlier  CBP 
program  which  was  limited  to  first  quarter  FICA  taxable 
payroll.  The  inclusion  of  total  quarterly  and  annual  payroll 
provides  a  needed  measure  of  continuity  with  other  data  series 
such as the Bureau's economic censuses and the IRS "Statistics
of  Income."  Total  annual  payroll  will  aid  in  the  identification  of 
those  seasonal  operations  not  having  payroll  values  for  the  first 
quarter,  an  advantage  not  realized  in  prior  years'  CBP. 

The CBP data base consists of some 5.4 million establishment
records: 4.4 million single unit records and about 1 million
multiestablishment records. The single unit data from the IRS
and the SSA are processed separately from the multiunit
records. Single unit records are initially submitted to a
computer check for the omission of employment and payroll
data. Employment data not present are estimated based on the
employment  and  payroll  ratios  of  those  employer  records  with 
reported  data.  Ratios  are  developed  at  the  State  2-digit  SIC  level 
for  various  employment  size  classes.  Quarterly  payroll  is 
estimated  based  on  the  reporting  patterns  of  eight  quarters  of 
payroll.  In  addition,  the  IRS  Form  941  payroll  data  are 
compared  to  the  SSA  Form  941,  Schedule  A  and  FICA  taxable 
payroll  with  the  difference  reconciled  when  needed. 
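The ratio approach to missing employment might look like the sketch below. It is a simplification under stated assumptions: ratios are formed only by State and 2-digit SIC (the employment size classes mentioned above are omitted), the record layout is hypothetical, and the eight-quarter payroll estimation and the payroll reconciliation are not covered.

```python
from collections import defaultdict


def employment_per_payroll(records):
    """Employment-to-payroll ratio per (state, sic2) cell, reporters only."""
    employment = defaultdict(float)
    payroll = defaultdict(float)
    for record in records:
        if record["employment"] is not None and record["payroll"]:
            key = (record["state"], record["sic2"])
            employment[key] += record["employment"]
            payroll[key] += record["payroll"]
    return {key: employment[key] / payroll[key] for key in employment}


def impute_employment(records):
    """Return copies of the records with missing employment filled in."""
    ratios = employment_per_payroll(records)
    completed = []
    for record in records:
        filled = dict(record)
        key = (record["state"], record["sic2"])
        if record["employment"] is None and record["payroll"] and key in ratios:
            filled["employment"] = round(record["payroll"] * ratios[key])
            filled["imputed"] = True
        completed.append(filled)
    return completed


# Hypothetical single unit records (payroll in dollars).
records = [
    {"state": "IL", "sic2": "54", "employment": 10, "payroll": 50_000},
    {"state": "IL", "sic2": "54", "employment": 30, "payroll": 150_000},
    {"state": "IL", "sic2": "54", "employment": None, "payroll": 100_000},
]
print(impute_employment(records)[2])
```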

The multiestablishment records are reconciled on a company
basis to the administrative record data reported to the IRS and
SSA. Year-to-year comparisons are made at the establishment
level  and  estimates  are  made  for  nonreported  data  based  on 
prior  year  information,  administrative  record  data,  or  industry 
group  averages  depending  on  the  data  item(s)  being  estimated. 

Following  the  initial  edits,  single  unit  records  having 
$100,000  or  more  in  annual  payroll  and  all  multiestablishment 
records  with  impossible  geographic  and  SIC  codes  are  printed 
for  analytical  review  and  correction.  The  single  unit  records  and 
multiestablishment records are merged and sorted into area by
SIC sequence in preparation for a cell edit. The summary cell
edit  tests  the  ratio  of  current  year  to  prior  year  data  entries  for 
four  data  items:  Total  number  of  establishments,  total  first 
quarter  employment,  total  first  quarter  payroll,  and  total  annual 
payroll.  Each  data  ratio  is  tested  against  a  unique  acceptance 
interval.  An  acceptance  interval  is  computed  for  each  data  ratio 
within  each  2-digit,  3-digit,  and  4-digit  SIC  group  within  each 
county  of  each  State.  The  computation  of  the  acceptance 
intervals  is  an  involved  process  requiring  many  different 
calculations.  The  acceptance  intervals  are  constructed  in  such  a 
manner  that  the  larger  summary  cells  are  more  likely  to  fail  the 
edit  than  smaller,  less  significant  summary  cells.  In  general,  the 
larger  the  magnitude  of  the  summary  cell  data  entry  in  the  prior 
year,  the  smaller  the  potential  for  relative  change  in  the  current 
year  entry.  However,  the  larger  cells  are  more  statistically 
significant.  Therefore,  a  weighting  scheme  is  utilized  which  gives 
a  greater  weight  to  ratios  of  larger  cells.  Any  summary  cell 
which  has  one  or  more  data  ratios  falling  outside  its  appropriate 
acceptance  interval  is  flagged  as  an  outlier.  The  outlier  cell  is 
displayed  on  a  summary  cell  outlier  listing  if  the  summary  cell 
has  fifty  employees  or  more  in  either  the  current  or  prior  year. 
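The general shape of such a ratio edit is sketched below. The actual computation of the acceptance intervals is not described here, so the interval formula is purely hypothetical: it simply narrows around 1.0 as the prior-year entry grows, which mimics the stated property that larger cells are held to smaller relative changes. Only the listing rule of 50 employees is taken from the text.

```python
from math import sqrt


def acceptance_interval(prior_value, scale=5.0, floor=0.05):
    """Hypothetical interval for the current/prior ratio of one data item."""
    half_width = max(floor, scale / sqrt(prior_value))
    return max(0.0, 1.0 - half_width), 1.0 + half_width


def edit_cell(prior, current):
    """Test each data item of a summary cell; return (failed items, listed).

    `prior` and `current` map item names to cell totals and must both
    include an "employment" entry. A cell is an outlier if any ratio falls
    outside its interval; it is listed for review only if it has 50 or more
    employees in either year.
    """
    failed = []
    for item, prior_value in prior.items():
        if prior_value <= 0:
            continue  # no ratio can be formed for an empty prior-year entry
        low, high = acceptance_interval(prior_value)
        ratio = current[item] / prior_value
        if not low <= ratio <= high:
            failed.append(item)
    listed = bool(failed) and max(prior["employment"], current["employment"]) >= 50
    return failed, listed


# Hypothetical summary cell for one county and SIC group.
prior = {"establishments": 25, "employment": 400,
         "q1_payroll": 1_000_000, "annual_payroll": 4_000_000}
current = {"establishments": 26, "employment": 600,
           "q1_payroll": 1_020_000, "annual_payroll": 4_100_000}
print(edit_cell(prior, current))
```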

To assist the analyst in determining why a summary
cell data ratio fails the edit, the five establishments which are
most responsible for the data ratio failing the acceptance limit
are displayed immediately below the outlier cell. The expectation
is that the displayed establishments will isolate most of the
problems affecting the summary cell, minimizing the amount of
research required. The individual establishment data are available
on microfilm for research purposes in the event the five
displayed establishments are not the cause for the edit failure.
All multiestablishment records are on microfilm in addition to
those single unit establishment records with $125,000 or more
annual payroll.
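Selecting the establishments to display could be done along the following lines. Ranking by the absolute year-to-year change in the failing data item is an assumption, since the text does not say how "most responsible" is measured. The record layout is hypothetical.

```python
def top_contributors(establishments, item, count=5):
    """Ids of the establishments with the largest absolute change in `item`.

    Each establishment is a dict with an "id" and with "prior" and "current"
    dicts of data items; a missing item is treated as zero.
    """
    def change(establishment):
        before = establishment["prior"].get(item, 0)
        after = establishment["current"].get(item, 0)
        return abs(after - before)

    ranked = sorted(establishments, key=change, reverse=True)
    return [establishment["id"] for establishment in ranked[:count]]


# Hypothetical establishments in one outlier cell.
cell = [
    {"id": "A", "prior": {"employment": 20}, "current": {"employment": 25}},
    {"id": "B", "prior": {"employment": 50}, "current": {"employment": 250}},
    {"id": "C", "prior": {"employment": 10}, "current": {"employment": 11}},
    {"id": "D", "prior": {"employment": 90}, "current": {"employment": 40}},
    {"id": "E", "prior": {"employment": 15}, "current": {"employment": 15}},
    {"id": "F", "prior": {"employment": 60}, "current": {"employment": 30}},
    {"id": "G", "prior": {}, "current": {"employment": 75}},
]
print(top_contributors(cell, "employment"))
```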

After  the  cell  edit  review  the  establishment  data  are 
corrected  and  summarized  for  the  publication  tables.  After  the 
publication  tables  are  reviewed,  corrections  are  carried  to  the 
tables  prior  to  sending  the  reports  for  printing.  The  latest 
County  Business  Patterns  publications  cover  the  year  1975.  We 
anticipate  that  the  1976  publication  will  be  available  in  early 
1978. 

FUTURE  COUNTY  BUSINESS  PATTERN 
PROGRAM 

The  County  Business  Patterns  project  will  be  periodically 
changed  in  order  to  provide  statistical  data  to  complement  the 
ongoing  current  economic  statistical  programs.  The  Bureau  of 
the Census has been designated by the Executive Office of the
President, Office of Management and Budget, as the agency
responsible  for  maintenance  of  the  SSEL  and  for  providing 
other  Federal  agencies  with  access  to  the  SSEL  for  sampling  and 
other  statistical  purposes.  Access  to  individual  establishment 
data  is  not  yet  possible  due  to  legislative  restrictions.  In 
addition,  the  Bureau  of  the  Census  is  also  responsible  for 
providing  summary  data  which  will  provide  those  agencies  using 
the  SSEL  file,  and  all  other  data  users,  with  equivalent 
benchmark  "control  totals"  of  the  universe  for  their  individual 
programs. 

Consideration  is  currently  being  given  to  additional  changes 
that  will  expand  the  utility  of  this  program.  In  order  to  publish 
summary  data  for  places  with  a  population  of  100,000  or  more 
we  plan  to  geographically  code  the  entire  file  to  place.  Place 
codes are now a part of the SSEL; however, these records
require additional research in order to assure their reliability.
This expansion will permit us to publish data for places with
population of 100,000 or more and to provide special tabulations
by place and/or ZIP code for places with less than 100,000
population.  We  will  also  be  able  to  publish  SMSA's  for  New 
England  since  New  England  SMSA's  are  defined  in  terms  of 
places  rather  than  counties.  The  great  demand  for  current  data 
by  place  and  by  economic  activity  can  only  be  satisfied  by  CBP 
which  covers  practically  all  economic  activities.  Urban  planners, 
market  researchers,  and  all  others  associated  with  economic  and 
social  problems  in  urban  areas  are  continually  hindered  by  the 
lack  of  detailed  statistics  at  the  place  level.  With  this  new 
information,  they  will  be  better  able  to  prepare  small  area 
economic  studies,  profiles,  and  analyses  of  economic  activity. 

Consideration is also being given to developing computer
edits and summary routines in preparation for the inclusion of sales
and receipts data as an additional data item in CBP. Research
and evaluation of these data are needed to resolve conceptual
problems concerning sales and receipts, namely what items are to
be included or excluded and what is readily reportable by
finance institutions, real estate firms, and nonprofit organizations.

At  present,  receipts  data  from  IRS  Statistics  of  Income  (on  a 
legal  entity  basis)  for  the  United  States  and  selected  SMSA's  are 
all  that  are  available  without  the  need  to  refer  to  more  than  one 
set  of  statistical  data  which  may  have  comparability  problems. 
The  two  indicators,  employment  and  payroll,  of  economic 
activity  currently  provided  have  several  drawbacks: 

-  they fail to recognize the economic impact of firms
   which have relatively few employees yet have a significant
   dollar volume of business,

-  they do not provide the necessary information for the
   construction of the national accounts. This is especially
   true in the estimates of personal consumption expenditures,
   where comprehensive annual data on the service
   sector of the economy are lacking, and

-  while employment and payroll may be good proxies for
   the general level of business activity, the value of sales
   and receipts is generally a more sensitive and accurate
   indicator, especially during swings of the business cycle.
   For example, changes in the level of a firm's sales or receipts
   are generally reflected more slowly in the employment
   levels.

The  addition  of  sales  and  receipts  would  help  to  resolve  the 
mentioned  shortcomings  and  respond  in  a  very  meaningful  way 
to  the  expressed  need  for  more  useful  statistics.  In  general,  the 
inclusion  of  sales  and  receipts  data  in  CBP  would  provide  an 
accurate  identification  of  this  important  view  of  the  economic 
spectrum  and  would  increase  the  value  of  the  series.  Comparable 
data  would  be  provided  for  the  4-year  interval  between 
economic  censuses.  The  data  would  be  of  great  value  to  data 
users in general and specifically to the Department of Commerce,
Bureau of Economic Analysis, in determining gross
national  product  and  in  their  analysis  and  measurement  of 
economic  trends.  The  proposed  data  would  also  serve  to 
strengthen  existing  Census  Bureau  programs  by  providing  annual 
benchmarks  and  giving  a  basis  for  comparison  of  results. 

Sales  and  receipts  data  for  single  location  companies  would 
be  obtained  from  the  IRS  income  tax  returns.  Corresponding 
sales  and  receipts  data  for  individual  establishments  of  multi- 
establishment  firms  would  be  obtained  via  an  added  item  on  the 
annual  Company  Organization  Survey. 

The  proposed  program  would  provide  sales  and  receipts, 
March  12th  employment,  total  first  quarter  payroll,  total 
annual  payroll,  and  number  of  establishments  by  employment- 
size  class  to  the  4-digit  SIC  level  for  each  county  and  each  place 
with  100,000  inhabitants  or  more  in  the  United  States.  In 
addition,  summary  data  will  be  provided  at  the  U.S.,  State,  and 
SMSA  level. 

We are exploring the possibilities of including in CBP data for
governments and for exempt nonprofit organizations, that is,
nonprofit organizations exempt from Social Security taxes. The
government sector represents a total of approximately 14
million employees, or over 15 percent of the labor force. This
expanded coverage will significantly increase the value of
the publication, especially for those geographic areas with a
significant amount of government employment. The data would
provide benchmark figures on government employment and
payrolls.

Data  on  the  exempt  nonprofit  institutions  would  complete 
the  coverage  in  this  area.  This  would  help  resolve  the  existing 
problem  in  the  national  accounts  where,  for  lack  of  reliable 
data,  these  institutions  are  included  with  individuals  in  the 
personal  sector  of  the  income  and  product  accounts. 

Thus,  including  these  two  sectors  of  the  economy  will  go  a 
long  way  towards  making  the  CBP  a  fully  comprehensive  annual 
program. 

In conclusion, making the described improvements will not be a
simple task, particularly with sales and receipts, because of the
intricacies involved in the use of administrative data in the
industries outside the scope of the censuses. There are many conceptual
problems which must be confronted and resolved as to what
items are to be included and what data are readily
reportable by finance, insurance, real estate, transportation, and
nonprofit organizations. There are also problems with the establishment
concept in the out-of-scope industries which
require resolution. Also, we anticipate operational problems in
matching the EI numbers reported in the Company Organization
Survey to IRS records, both with the EI numbers themselves and with
the distribution of the sales and receipts data on an establishment
basis. Once these problems have been resolved, and this may
take several years, we expect to produce a much improved CBP
which will provide an annual census of economic activity of the
employer universe.
A Study of Federal Taxes and
Grants  by  County 

Charles  Hicks  and  Peter  Sailer 
Internal  Revenue  Service 

As employees in the Statistics Division of the Internal
Revenue Service (IRS)¹ who are involved in small-area data,
the  questions  we  are  asked  most  frequently  by  representatives 
of  State  and  local  governments  are:  How  much  do  the  citizens 
of  our  State  or  county  pay  in  taxes  to  the  Federal  Government? 
What  does  our  area  get  back  from  the  Federal  Government  in 
the  way  of  grants?  And,  how  do  these  data  compare  to  those  for 
other  counties,  given  our  size  and  economic  level? 

Until  very  recently,  we  have  been  unable  to  answer  questions 
about  taxes  paid  by  county  directly.  The  tax  return  did  not 
contain  an  indication  of  the  political  subdivision  in  which  the 
taxpayer  lived,  and  we  did  not  have  access  to  or  resources  to 
develop  a  system  to  approximate  county  codes  by  using  return 
addresses.  However,  beginning  with  tax  year  1972,  taxpayers 
were  asked  to  indicate  their  county  of  residence  on  their  tax 
returns.  The  need  for  this  information  arose  because,  under  the 
Revenue  Sharing  Act,  the  Census  Bureau  was  required  to 
provide  annual  updates  on  per  capita  income  by  counties. 
Taxpayer response to this request for location information
was  about  73  percent  in  the  first  year.  For  1974,  the  year  used 
in  this  study,  86  percent  of  all  taxpayers  filled  in  the  county 
box. 

Happily,  whatever  the  public  does  not  provide  us  with,  the 
Census  Bureau  does.  The  Census  Bureau  is  obliged  to  generate 
the missing codes for its own purposes, using street, city, and
ZIP  code,  and  the  Bureau  gives  this  information  back  to  us  at 
comparative  budget  prices.  As  a  result,  we  were  able  to  issue 
data  earlier  this  year  for  1972  on  number  of  returns  by  marital 
status;  major  types  of  income;  exemptions;  and  tax  liability,  for 
each  of  the  3,142  counties  and  other  political  subdivisions  in 
the  country  (see  reference  1).  Similar  data  for  1974  have  now 
been  prepared,  and  should  be  available  shortly  (see  reference  2). 

These  data  are  available  in  supplemental  reports  to  the 
Internal  Revenue  Service's  Statistics  of  Income  series.  Actually, 
they  are  not  strictly  comparable  to  those  published  in  regular 
Statistics  of  Income  reports,  since  the  latter  are  based  on  a 
specially  edited  and  transcribed  sample  of  returns,  while  the 
supplemental  reports  are  based  on  the  complete  IRS  file  of 
administrative  records.  The  major  advantage  of  using  this 
Individual Master File (IMF) is obvious: it avoids sampling
variability. The major disadvantages are that the statistical series
is limited to those items retained on the IMF for administrative
purposes, and that the Statistics Division has no control over the
abstracting and perfecting of those data items which are used. A
further difference between the two files is that the IMF is
constantly being updated as amended returns and audit results
become available, whereas the regular statistical sample represents
the original returns filed by taxpayers. However, table 1
shows that, in spite of the differences cited above, most of the
data items did not differ significantly between these two files.

¹ Any opinions expressed in this paper are those of the authors, and do
not reflect official positions of the Internal Revenue Service. While the
authors accept full responsibility for all opinions, assumptions, and
conclusions, they would like to express their appreciation to Jack
Blacksin, Vito Natrella, Frederick Scheuren, and Robert Wilson, for their
helpful comments, as well as Marianne Ford for doing the typing.

Corporate  Income  Taxes  and  Counties 

First,  some  of  our  callers  and  correspondents  also  want  to 
know  the  corporate  income  tax  burden  of  individual  counties. 
The  simple  answer  to  that  question  is  that  corporations 
generally  file  returns  where  their  national  headquarters  are 
located,  which  may  not  be  the  site  of  all  or  even  any  of  their 
productive  activities,  and  that  the  question  is  therefore 
unanswerable.  However,  this  may  not  be  relevant,  since  the 
taxes  imposed  on  a  corporation  are  presumably  passed  on,  not 
to  the  people  living  in  the  general  vicinity  of  its  operation,  but 
either  to  the  consumers  of  its  products  or  services  in  the  form  of 
higher  prices;  or  to  its  shareholders  in  the  form  of  lower 
dividends.  If,  in  fact,  the  taxes  are  passed  on  to  the  consumer, 
then  the  distribution  of  the  corporate  income  tax  burden  would 
be  very  similar  to  that  of  the  individual  income  tax,  since,  as 
will  be  shown  later,  the  individual  income  tax  is  strongly  related 
to  per  capita  money  income.  On  the  other  hand,  if  one  assumes 
a  passing  on  to  shareholders,  only  dividend  income  should  be 
used to distribute the corporate tax over small areas. The results
of making either one of these assumptions, or various combinations
thereof, are a possible subject for another paper. Suffice
it to say that dividends by county are shown in IRS's Small
Area  Data  report,  and  could  easily  be  made  part  of  a  formula 
for  determining  the  corporate  tax  burden. 

Federal  Grants  and  Counties 

The  second  major  portion  of  the  question  our  colleagues  in 
the  county  governments  ask  us  is:  What  is  our  county  getting 
back  from  the  Federal  Government  in  the  way  of  grants?  Grant 
data  are  currently  available  at  the  county  level  in  the  series 
called Federal Outlays (see reference 3), a series with a rather
limited distribution. Unlike Statistics of Income, it is not for
sale by the U.S. Government Printing Office; however, a few
Government libraries do have it. The data in this series are
compiled by the Community Services Administration (CSA).
The word "compiled" is used advisedly, since CSA is more or
less at the mercy of the various agencies that report their
expenditures. Not all agencies keep track of their expenditures
by program by county, and agencies that do not keep such records
use a variety of procedures to break down the data. Sometimes,
if a county-by-county distribution of the target population is
available, for instance, educationally
deprived children, then the State's allocation for a particular
program  will  be  distributed  in  the  same  proportion  as  the  target 
population. Meanwhile, the State may or may not be distributing
the funds on a proportional basis. Nonetheless, the CSA
data represent one of the great efforts at compiling data from
diverse sources and, when used with all the precautionary
measures, are very helpful.


Table 1. Statistics of Income Reported by Size of Adjusted Gross Income in the United States

(Percent)

                                                 Size of adjusted gross income
                                             All     Under    $5,000   $10,000   $15,000
Item                                     returns    $5,000     under     under   or more
                                                             $10,000   $15,000

Number of returns                           99.0      99.2      97.9      99.6      99.2
Number of joint returns                    100.8     112.1     101.3      99.7      99.3
Number of exemptions:
  Total                                     98.4      98.6      97.0      98.8      99.1
  Dependents                                97.1      93.5      95.2      97.8      98.8
Adjusted gross income                       99.0      97.4      97.9      99.5      99.3
Salaries and wages:
  Number of returns                         98.3      98.3      97.8      99.2      98.4
  Amount                                    98.2      96.9      97.3      98.9      98.4
Dividends in adjusted gross income:
  Number of returns                         98.9      98.9      97.6     100.9      94.1
  Amount                                    97.9      97.7      99.7     103.3      97.2
Interest:
  Number of returns                         98.9      99.5      97.6      99.1      99.3
  Amount                                    98.1     100.7      92.0     101.2      98.9
Total tax:
  Number of returns                         98.4      96.7      97.7      99.6      99.2
  Amount                                    99.1      96.6      98.0      99.5      99.3
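
The proportional breakdown described above for programs without county-level records can be sketched as follows. This is a minimal illustration only: the function name, the county names, and the figures are hypothetical, and the reporting agencies' actual procedures vary and are not documented here.

```python
def allocate_by_target_population(state_allocation, target_population):
    """Split a State's program allocation among counties in proportion
    to each county's share of the target population."""
    total = sum(target_population.values())
    if total <= 0:
        raise ValueError("total target population must be positive")
    return {
        county: state_allocation * count / total
        for county, count in target_population.items()
    }


# Hypothetical counts of educationally deprived children by county.
target_population = {"County A": 1200, "County B": 300, "County C": 500}
estimated_outlays = allocate_by_target_population(1_000_000.0, target_population)
```

An estimate produced this way reflects the distribution of the target population, not the amounts a State actually passed on to each county, which is why such figures call for the precautions mentioned above.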

County  Comparisons 

The third element of the familiar question is: How does our
county compare to other counties of similar size and economic
condition? Per capita money income is often used as an
indicator of economic condition. Obviously, it is an imperfect
measure of poverty or affluence. Income distributions, in
particular the number of persons below the poverty line, might
be more useful in measuring need. As pointed out by Charles
Troob in these proceedings (see reference 4), Congress has
specified a wide variety of criteria and data bases for computing
allocations in various grant programs. Nonetheless, the
questions we are asked by county governments are often
formulated in terms of per capita income. Therefore, we
decided to do a simple correlation study to determine whether
there was a correlation between per capita income on the one
hand, and level of grants and taxes on the other. Luckily, per capita
income data at the county and sub-county level are now
available annually from the Census Bureau (see reference 5).
Roger Herriot, in another paper in these proceedings (see
reference 6), explained in great detail how these statistics are
derived. One important source used was the IRS Individual
Master File, the same data base used for the tax statistics
described above.

One  of  the  first  hurdles  we  had  to  overcome  was  the  sheer 
volume  of  data  involved  in  analyzing  a  number  of  Federal 
programs  in  3,142  counties.  Unfortunately,  we  did  not  have 
access  to  the  data  on  computer  tape— a  situation  that  may  be 
remedied  in  the  near  future.  Therefore,  we  decided  to  limit  our 
research  to  170  counties,  which  we  selected  from  four  strata 
based  on  per  capita  income;  these  are  low,  lower-middle, 
upper-middle, and upper income.² We also limited our correlation
study to simple, straight line regression.
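
A selection of this kind, ranking counties by per capita income and taking clusters of neighboring counties at regular intervals along the ranking, might be sketched as below. The cluster size, the centering of each cluster within its interval, and the county names and incomes are all assumptions made for illustration; they are not the parameters of the study.

```python
def select_clusters(per_capita_income, n_clusters=4, cluster_size=3):
    """Rank counties by per capita income and pick `n_clusters` groups of
    consecutive counties, one centered in each equal slice of the ranking."""
    ranked = sorted(per_capita_income, key=per_capita_income.get)
    n = len(ranked)
    if n_clusters * cluster_size > n:
        raise ValueError("not enough counties for the requested clusters")
    interval = n / n_clusters
    clusters = []
    for k in range(n_clusters):
        # Start position that centers the cluster within the k-th interval.
        start = int(k * interval + (interval - cluster_size) / 2)
        clusters.append(ranked[start:start + cluster_size])
    return clusters


# Hypothetical counties with steadily increasing per capita income.
incomes = {f"county_{i:02d}": 1000 + 100 * i for i in range(20)}
strata = select_clusters(incomes)  # low, lower-middle, upper-middle, upper
```

Because each cluster is drawn from a different part of the ranking, the four groups correspond to the low, lower-middle, upper-middle, and upper income strata.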

As  expected,  the  impact  of  the  Federal  individual  income  tax 
was  strongly  related  to  level  of  income.  The  correlation 
coefficient  was  .898  when  per  capita  taxes  were  compared  to 
per capita income. On the other hand, the correlation between
per capita grants and per capita income appeared very low, at
negative .202. We then decided to take a closer look at
individual grant programs, to see whether any of them seemed
dependent on average income. Table 2 shows the results we
achieved for eight of the largest grant programs. There was a
strong negative correlation between income and the aid to
educationally deprived children as well as the food stamp
program.

² The methodology used was to rank all counties sequentially by size
of per capita income, and to select four clusters of counties at regular
intervals. Two counties and an independent city had to be eliminated
because examination of similar items in the various data bases (e.g., IRS
exemptions and Census population) led us to suspect coding problems in
defining these areas. A list of the counties used will be supplied on
request by the authors.
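
For readers who wish to reproduce this kind of comparison on their own county data, the simple correlation coefficient and the straight line fit can be computed as in the sketch below. The function names and the sample figures are hypothetical and are not the data of this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Simple (Pearson) correlation coefficient between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares straight line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("slope is undefined when x is constant")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical per capita figures (dollars) for a handful of counties.
income = [3000.0, 4000.0, 5000.0, 6000.0]
taxes = [300.0, 450.0, 640.0, 800.0]
grants = [260.0, 210.0, 190.0, 150.0]
r_taxes = pearson_r(income, taxes)    # positive: taxes rise with income
r_grants = pearson_r(income, grants)  # negative: grants fall with income
```

A coefficient near +1 or -1 indicates a nearly straight line relationship; a coefficient near zero indicates only that a straight line fits poorly, not that the two items are unrelated.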

Somewhat surprising was the low correlation between income
and revenue sharing allocations, especially since average
income is one of the factors used in calculating revenue sharing
disbursements. However, various other factors in the formula
work to reduce the effect of per capita income:

1. Not only level of income, but the local tax effort being
made  by  the  county  relative  to  its  income  level  is  a  deciding 
factor. 

2.  Tax  effort  and  income  level  determine  only  which 
proportion  of  the  State's  allocation  goes  to  the  individual 
county.  However,  the  State's  share  of  the  revenue  sharing  is 
determined  by  its  overall  per  capita  income  as  well  as  its 
own  tax  effort. 


Thus, a poor county in a relatively affluent State, or one
making a poor tax effort, may get a large share of that State's
allocation, but it may still not do as well as a similarly poor
county  in  another  State.  This  is  particularly  true  since  no 
county's  share  may  exceed  145  percent,  or  be  less  than  20 
percent,  of  the  average  per  capita  allocation  going  to  the 
counties  in  that  State. 
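
The effect of these limits can be sketched as a simple floor and ceiling on each county's per capita allocation. This is an illustration under stated assumptions: the function name and figures are hypothetical, and the sketch does not model how amounts cut off by the ceiling, or added by the floor, are redistributed among the other counties.

```python
def apply_share_limits(per_capita_allocation, state_average,
                       floor=0.20, ceiling=1.45):
    """Hold each county's per capita allocation between 20 percent and
    145 percent of the average per capita allocation in its State."""
    lowest = floor * state_average
    highest = ceiling * state_average
    return {
        county: min(max(amount, lowest), highest)
        for county, amount in per_capita_allocation.items()
    }


# Hypothetical per capita allocations (dollars) before the limits apply.
unconstrained = {"County A": 2.0, "County B": 50.0, "County C": 20.0}
limited = apply_share_limits(unconstrained, state_average=20.0)
```

Here the poorest county is raised to the floor and the county with the largest computed share is cut back to the ceiling, which weakens any straight line relationship between income and allocation.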

Not only was the correlation coefficient for the Comprehensive
Employment and Training Act (CETA) very low, it was
positive.  At  least  two  factors  appear  to  be  involved:  First,  the 
program  was  just  starting  during  fiscal  year  1974,  and  a  large 
proportion  of  the  counties  in  the  study  had  entries  of  zero  for 
this  program.  It  is  likely  that  the  governments  of  the  poorest 
counties  took  longer  to  get  into  the  program  than  those  of  the 
more  affluent  counties. 

Second, there are special allocation problems associated with
this  program,  since  many  of  the  grants  go  to  groups  of  counties 
which  have  formed  "consortia." 

Table 2. Correlation of Per Capita Federal Taxes and
Grants With Per Capita Money Income

                                                         Correlation
                                                         coefficient
Item                                                     for sampled
                                                            counties

Federal individual income tax                                +.89758
Total Federal grants                                         -.20208
Major Federal grant programs:
  Food stamp bonus coupons                                   -.74663
  Educationally deprived children                            -.75657
  Medical assistance program                                 -.64388
  Public assistance, maintenance assistance (State aid)      -.41949
  Community development block grants,
    entitlement grants                                       +.19218
  CETA Title I, comprehensive manpower services              +.15928
  Fiscal assistance to State and local
    governments (revenue sharing)                            -.02844
  Highway planning and construction                          -.52001


Conclusions


In  conclusion,  we  would  like  to  emphasize  that  we  have  only 
scratched  the  surface  of  these  very  rich  sources  of  county  data 
that  have  become  available  as  a  result  of  the  revenue  sharing 
program.  We  would  have  liked  to  expand  the  study  to  cover  all 
counties.  We  would  like  to  try  and  fit  something  other  than 
straight  line  functions  to  these  data.  To  make  all  these  studies,  it 
will  obviously  be  necessary  to  get  the  data  in  machine-readable 
form.  In  this  connection,  we  are  happy  to  note  that  the  IRS  is 
planning  to  make  the  summarized  tax  return  data  available  on 
computer  tape  through  the  National  Technical  Information 
Service  in  the  very  near  future  (see  reference  7).  The  data  on 
Federal  outlays  are  currently  available  on  tape  through  the 
National  Archives  (see  reference  8). 

Beyond  expanding  this  study,  however,  what  is  really  needed 
is  someone  to  pull  together  many  more  of  the  rich  data  bases  on 
counties  which  have  become  available  in  recent  years.  The 
extensive  demographic  information  in  the  Census  Bureau's 
County-City Data Book tape, as well as administrative data from
the Social Security Administration, the Veterans' Administration,
and the files discussed in this paper, should be part of this data
base.

In  short,  the  model  should  combine  in  one,  easily  accessible 
file all items needed to compute the various grant allocation
formulas,  and  to  evaluate,  after  the  fact,  how  well  those 
formulas  have  worked. 
Methodology Issues:
Comments  on  Papers 

Wray  Smith 

Department  of  Health,  Education,  and  Welfare 

To discuss nine different papers in 15 minutes is an
ambitious  task  and  we  will  not  attempt  to  relate  all  points  in 
those  papers  to  the  conference  theme.  Instead,  we  shall 
concentrate  on  the  main  aspects  and  interrelationships  of 
small-area  statistics. 

Concerning  estimates,  projections,  and  other  related  things, 
there  is  always  the  question  of,  "who  is  in  charge  here?"  It  is 
not  certain  that  we  have  a  clear  focus  on  the  domain  of 
small-area  statistics,  even  though  the  Office  of  Management  and 
Budget (OMB) has promulgated some procedural directives in
great  detail.  In  the  struggle  of  producing  estimates  and 
projections,  we  may  very  well  end  up  adopting  a  kind  of 
"lowest  common  denominator"  or  "least-worst"  method  but 
may  not  be  able  to  cope  well  pragmatically  with  others.  In  his 
paper, Engels referred to the need for getting estimates out early,
which leads to provisional, interim, and revised estimates, and,
as a consequence, some confusion among users. This state of
affairs is inevitable, even necessary, since there is always a
strong demand for early release of "quick and dirty" estimates.
Perhaps such estimates may be updated by better estimates
later. But the risk to the user is that two sets of numbers are
circulated and that this may be somewhat confusing. It might be
less confusing if, instead of labeling these estimates preliminary
and revised, we would treat them as different series (and some
of the estimates described earlier really were that), with each
having a life and life span of its own. Perhaps we need to make
such series more distinct, along the lines of the "M" series
(money supply series, and also the GNP from the national
product accounts) that the Federal Reserve Board has for a number
of different kinds of estimates, all involving money.

The  fact  that  one  can  get  different  counts  on  variables  which 
have  not  been  controlled  was  referred  to  by  Engels  and  Gerson 
in  reference  to  the  survey  of  income  and  education  (SIE).  If 
one reports on variables which are controlled for age, for example,
as well as on others which are free floating and then
compares  estimates  of  the  free  floating  variables  from  two 
surveys  (or  between  a  survey  and  administrative  records),  one 
has  the  obligation  to  warn  the  users  of  the  statistics  so  that  they 
will  not  be  surprised  if  these  figures  are  different.  As  a  matter 
of  fact,  it  would  be  astonishing  if  they  proved  to  be  otherwise. 

There is the issue of the best estimates from an estimates
program in contrast with the best estimates from one or
more surveys. This issue is being encountered more often
these days. For example, note the comparisons between the
current population survey (CPS) and the SIE; sometimes it
appears appropriate to use estimates from one and sometimes
from the other. If data from these sources are compared
with data from administrative records, we then have
two sets of differences. For example, if one has collected and
processed personal income figures from the Bureau of Economic
Analysis (BEA) and has no other figures to compare them to,
one will tend to use data from other series for comparison. The
CPS found itself in that kind of bind until the advent of the
SIE. Those of us who expected a definitive comparison found,
instead, that we had a variety of factors which affected the
differences. We did not have objective criteria to resolve these
statistical perturbations. Thus, we did not dare to assert that the
CPS was no good. Instead, the evidence shows that, in general,
we had a somewhat better measure of poverty and money
income in the SIE. We cannot say, however, that it would
necessarily happen again this year if we repeated the SIE.

Small-area  statistics  in  a  way  comprise  a  "Vietnam  of 
statistics."  We  get  in  easily,  but  we  don't  know  how  to  get  out. 
We  have  all  those  problems,  and  we  realize  that  they  are 
important  problems,  but  we  cannot  and  must  not  resort  to 
academic  kinds  of  arguments  to  settle  them.  There  are  presently 
no  neat,  "true"  indicators,  which  could  tell  us  which  is  a  "high 
quality"  data  set  and  which  is  not. 

In Coleman's remarks there seemed to be a notion that
administrative records are more reliable than survey data
because an agency has reviewed the records; however, this
assumption is debatable. In general, the methods and standards
of  administrative  recordkeeping  may  be  worse  than  those 
applied  in  well  conducted  surveys.  We  should  become  very 
cautious  when  we  are  driven  to  the  use  of  administrative  records 
or  when  one  proposes  to  use  administrative  records,  as  in  the 
case  of  personal  income  data. 

HEW  and  the  Bureau  of  the  Census  are  developing  a 
proposed  new  survey  of  income  and  program  participation  for 
the  1980's.  But  no  one,  in  his  wildest  dreams,  would  expect 
such  a  survey  to  be  large  enough  to  produce  data  at  the  county 
level.  Presently  we  are  thinking  of  collecting  data  at  the  State 
level every 5 years, and then only if we can get enough money
to fund such an endeavor. So, if one wishes to go down to the
county  level,  one  has  to  use  other  kinds  of  sources  which  are 
often  unsatisfactory.  Then  comes  the  question,  to  what  degree 
can  we  trust  these  sources?  In  this  respect  it  is  important  to 
note  Engels'  table  3,  where  he  compares  areas  in  which  there 
appeared  to  be  a  difference  between  the  results  of  the 
estimation  method  and  the  results  produced  by  special  censuses. 
Most  of  these  areas  were  atypical  because  we  are  usually  dealing 
with  a  fast-growing  area,  when  we  take  a  special  census.  The 
differences  noted  by  Engels  do  not  usually  characterize  a  whole 
universe.  Such  differences  usually  develop  because  of  the 
uniqueness  and  peculiarity  of  the  relevant  areas. 

Herriot  reiterated  and  reinforced  our  understanding  of  the 
complexity  of  the  methodological  procedures  one  is  faced  with 
when  doing  things  like  the  per  capita  income  computations  for 
revenue  sharing.  Herriot's  comments,  and  other  comments  as 
well,  lead  to  the  trite  but  sound  principle:  Whatever  your 
methodology— describe  it  and  expose  it  for  others  to  review 
critically.  Don't  just  keep  it  in  a  backroom  until  some  angry, 
interested  party,  such  as  a  Congressman,  or  a  court  suit  makes 
you  reveal  your  method.  Especially  in  the  area  of  small-area 
statistics  we  must  be  pragmatic  and  must  not  expect  uniformity 
across the country. We must use the data where we can find it,
and  involve  everyone  in  an  evaluative  dialogue  concerning  the 
best  procedures. 

Herriot's paper generated a little puzzlement over the
following: If one has a weighted average of a regression, and
then some sample estimates, what is the applicable theory for
characterizing how good the combined estimate is? What are the
right criteria and rules in terms of the respective errors?
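
One textbook starting point for this question, offered only as a sketch and not as the method of Herriot's paper, treats the sample estimate and the regression estimate as unbiased and independent with known variances. Under those assumptions (which may well not hold in practice), the variance of the weighted average follows directly from the weights, and the weight that minimizes it is the inverse-variance weight. All names and numbers below are hypothetical.

```python
def combine_estimates(sample_est, sample_var, regression_est, regression_var,
                      weight=None):
    """Weighted average of a sample estimate and a regression estimate.

    Assumes the two estimates are unbiased and independent. If no weight is
    given for the sample estimate, the variance-minimizing weight is used.
    Returns (combined estimate, variance of the combined estimate).
    """
    if weight is None:
        weight = regression_var / (sample_var + regression_var)
    combined = weight * sample_est + (1 - weight) * regression_est
    variance = weight ** 2 * sample_var + (1 - weight) ** 2 * regression_var
    return combined, variance


# Hypothetical per capita income estimates for one small area.
estimate, variance = combine_estimates(
    sample_est=100.0, sample_var=1.0,
    regression_est=110.0, regression_var=3.0,
)
```

If the regression estimate is biased, or the two errors are correlated, this variance understates the error of the combined estimate, which is the substance of the puzzlement noted above.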

Another  issue  of  great  importance  is  the  standardization  of 
survey  items.  The  fact  that  we  had  46  questions  in  common  on 
the  SIE  and  the  March  CPS  gave  a  better  basis  for  comparison 
than  if  there  had  been  a  different  set  of  income  questions.  It  left 
us  a  little  less  perplexed  in  respect  to  the  difference  in  the 
estimates  than  if  we  had  used  different  questions.  We  must 
sacrifice  some  of  the  special  purposes  of  a  given  survey  to  gain
uniformity  of  training  and  instructions  for  the  interviewers.

Troob's  topic  was  familiar  to  me.  But  I  found  it  refreshing  in 
terms  of  his  treatment  of  the  political  issues.  I  agree  with  him 
that  the  selection  of  measures  is  as  much  a  political  choice  as  a 
scientific  one.  Again,  there  arises  the  question  of  authority  and 
responsibility.  But  Troob  put  it  all  in  terms  of  Congress.  It
seems  to  me  that  the  behavior  he  described  was  the  kind  that 
one  faces  also  with  decisionmakers  in  the  Executive  branch.  If 
someone  is  forced  to  choose  on  a  practical  basis  there  is  only 
going  to  be  so  much  patience  for  input  from  statistical  and 
other  technical  experts.  For  example,  when  it  comes  to  the 
revenue  sharing  formula,  one  can  choose  the  better  of  two 
formulas,  which  is  awkward.  It  may  be  more  practical  to  give  the
States  a  hold-harmless  provision.  For  example,  one  may  take  the
last  two  years'  estimates  and  then  select  either  the  higher  or
lower  of  the  two  estimates  depending  on  the  particular  purpose. 
That  may  not  be  dumb  politics,  and  it  may  not  even  be  bad 
statistics,  as  long  as  one  tells  everyone  exactly  what  one  is 
doing.  To  iterate  the  computations  may  be  clumsy,  but  that  is  a
problem  of  aesthetics  (convenience),  not  of  statistics.
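
The hold-harmless rule described above can be sketched in a few lines. This
is a minimal illustration only: the function name, the "favor" parameter, and
the sample figures are hypothetical and do not come from any actual
revenue sharing formula.

```python
def hold_harmless_estimate(previous_year, current_year, favor="higher"):
    """Pick one of the last two years' estimates for use in an allocation.

    favor="higher" keeps the larger estimate, favor="lower" keeps the
    smaller one; which is appropriate depends on the purpose of the program.
    """
    if favor == "higher":
        return max(previous_year, current_year)
    if favor == "lower":
        return min(previous_year, current_year)
    raise ValueError("favor must be 'higher' or 'lower'")


# Hypothetical State estimates for two consecutive years.
protected = hold_harmless_estimate(4.2, 3.9, favor="higher")
conservative = hold_harmless_estimate(4.2, 3.9, favor="lower")
```

Because the rule is a single, stated comparison, it is easy to publish
alongside the estimates, which is the condition the argument above places on
its use.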

The  interrelationship  between  surveys  and  administrative 
records  is  much  more  with  us  than  it  used  to  be.  Ziegler's  paper 
emphasized  the  growing  importance  of  unemployment  estima- 
tion, especially  since  the  CETA  legislation,  and  certain  other 
events,  including  the  new  proposal  on  welfare  reform.  We  can 
anticipate  the  question  of  how  to  target  the  special  public 
service  jobs  for  which  the  immediate  problem  concerns  a 
reasonable  basis  for  allocating  to  jurisdictions.  The  lack  of 
uniformity  in  State  administrative  recordkeeping  procedures 
and  definitions  makes  a  quality  check  or  audit  next  to 
impossible.  If  two  States  do  not  use  the  same  definitions  of 
common  terms  we  have  a  double  adjustment  problem. 


Another  problem  is  the  legislative  mandate  for  monthly  data. 
To  quote  Troob:  "Congress  can  do  a  lot  of  things,  but  why  did 
they  do  that?"  There  would  be  better  quality  and  a  more
equitable  allocation  of  funds  if  there  were  a  retreat  from  this 
specification  for  monthly  data.  The  legislators  intended  these 
specifications  to  stimulate  responsiveness.  But  the  idea  of 
responsiveness  seems  to  have  deteriorated  to  "get  the  monthly 
data  in."  In  the  final  analysis  we  have  to  use  the  data  as  they 
perforce  are  available,  even  if  we  cannot  evaluate  their  quality.

Schiedel's  paper  on  county  business  patterns  offered  the 
welcome  prospect  of  total  quarterly  and  annual  payroll.  Here,  as 
elsewhere,  there  are  many  conceptual  problems  which  need  to 
be  dealt  with.  Because  of  confidentiality  issues  with  smaller
firms,  the  published  data  will  still  be  unreliable:  firms  missing
from  the  lower  end  of  the  distribution  of  payroll  data  produce  a
truncation  effect.  But  the  data,  despite  their  shortcomings,  would  be
available  to  the  Census  Bureau.  Then  special  analyses  could  be 
done,  while  protecting  the  identity  of  the  companies.  When 
possible  changes  in  the  funding  of  the  Social  Security  trust  fund 
were  discussed  recently,  one  of  the  questions  was,  what  would 
the  burden  be  for  small  employers  with  10  employees  or  fewer?
We  scurried  around  and  we  found  that  such  data  were  not
readily  available.  However,  such  data,  or  at  least  some  special
analyses  of  them,  would  be  of  great  interest.

The  paper  by  Hicks  and  Sailer  relates  quite  closely  to  our 
central  topic,  though  at  first  glance  one  might  not  think  so. 
The  relationship  between  taxation  and  what  the  government 
returns  to  the  jurisdictions  by  way  of  allocations  is  a  very  live 
issue,  and  has  important  bearing  on  the  question  of  "equity."  In 
the  discussions  of  the  welfare  reform  proposal  we  have  been 
very  much  aware  of  the  interest  in  who  pays  and  who  benefits. 
Who  benefits  is  further  subdivided  into:  What  kind  of  people 
benefit  and  what  kind  of  jurisdictions  benefit.  The  Federal 
benefits  to  the  States  have  a  dynamic  range  of  five  to  one 
between  the  Federal  share  in  high  benefit  States  versus  the 
Federal  share  in  low  benefit  States,  simply  because  it  is  a 
matching  formula  rather  than  an  a  priori  allocation.  If  you  start
from  there  and  try  to  reform  something  by  means  of  a  formula, 
you  can  be  sure  to  encounter  some  difficulties. 

In  respect  to  Charles  Troob's  comments  that  the  SIE  may  not 
be  used  for  Title  I  of  the  ESEA  act,  it  is  worth  mentioning  that 
the  SIE  has  been  the  basis  for  the  welfare  reform  cost-and- 
coverage  estimates  and  for  evaluating  the  question  of  "better-
offness"  and  "worse-offness,"  which  is  what  everyone  is
basically  concerned  about  when  talking  about  small-area  data. 